Articulation Test Measures and Listener Ratings of Articulation Defectiveness


EVAN P. JORDAN 


The present study is concerned with 
an analysis of the relationships between 
(a) certain factors associated with de- 
fective articulation and (b) listener re- 
action as indicated by listener ratings of 
the severity of defective articulation in 
short samples of children’s speech. 

It is generally recognized that defec- 
tive articulation is likely to have an 
adverse effect on the personality of 
the speaker due, at least in part, to the 
reactions of his listeners. It is generally 
urged, therefore, that therapy be in- 
itiated as soon as possible following 
recognition of the defect. One of the 
major goals of therapy, then, is to re- 
duce as quickly as possible and as much 
as possible those deviations which are 
distracting to the listener. To accom- 
plish this most effectively, the clinician’s 
diagnosis must enable him to identify 
those factors contributing the most to 
making the speech defective. That is, 
he must decide what characteristics of 
the defective articulation most distract 
the listener. Articulation testing which 
does not provide this information has 
limited usefulness. 





Evan P. Jordan (Ph.D., University of Iowa, 
1960) is Assistant Professor of Speech Pathol- 
ogy, Department of English, Colorado State 
University. This article is based in part on a 
Ph.D. dissertation completed in 1960 under 
the direction of Dr. Dorothy Sherman. The 
research was supported by a grant from the 
American Speech and Hearing Foundation. 
Review of Literature 


Number of Sounds. One factor which 
might reasonably be supposed to relate 
to listener reaction is the number of 
speech sounds misarticulated. Perrin 
(11) found a high correlation between 
number of articulation errors and listen- 
er judgments of severity of articulatory 
defect. Bangs (1) used the number of 
defective sounds in investigating the re- 
lationship between intelligence and 
speech development. Roe and Milisen 
(12) used this manner of describing 
articulation defects quantitatively and 
investigated the changes in articulation 
with age. Templin (18) compared two 
methods of articulation testing using 
scores based on percentages of speech 
sounds correctly produced. 


Frequency and Position of Sounds. 
Wood (24) assumed that misarticulated 
sounds which occur frequently in the 
spoken language distract the listener 
more than those which occur infre- 
quently. He proposed a method of fre- 
quency weighting of misarticulated 
sounds for quantifying social adequacy 
of connected speech. 

Henrikson (8) suggested that, since 
a sound does not occur with equal fre- 
quency in each of the three conven- 
tional speech sound positions, equal 
weighting over the initial, medial, and 
final positions is not justified. 
Stetson (16) pointed out that there 
are no bases in the physiology of con- 
nected speech for such terms as initial, 
medial, and final sounds. His investiga- 
tions revealed that connected speech 
consists of series of syllables and that 
consonant sounds act to release or arrest 
the syllables. 

Barker (2) made use of three sets of 
frequency weights, one set each for the 
relative incidence of each consonant 
sound in the initial, medial, and final 
position, respectively. Using these sets 
of weights she computed an articulation 
score for each subject. She constructed 
two additional sets of frequency 
weights for each consonant as a syllable 
initiating sound and as a syllable termi- 
nating sound. An articulation score was 
computed using these weights. The
correlation between listener judgments of 
severity of articulatory defect and each 
of the two sets of articulation scores 
was high. 


Listener Reaction. Wright (25) rec- 
ognized that speech sound-errors proba- 
bly vary in degree of defectiveness, or 
listener distractibility, and described a 
method for scaling the magnitude of 
speech sound-errors with the number 
one representing a correctly articulated 
sound; two through five, progressive 
amounts of distortion of the sound; six, 
a substitution; and seven, an omission. 
The rationale for this scale is based in 
part on the finding of Roe and Milisen 
(12) that, in general, as articulation 
skills develop, sounds are likely to be 
first omitted, then distorted, and final- 
ly produced correctly. Such an order- 
ing of sound-errors may also involve 
the assumption that, in general, listeners 
will be distracted more by omissions 
than by substitutions and more by substitutions
than by distortions. These assumptions,
while tenable, have not been 
tested experimentally. 

Milisen (9) described a method for 
predicting the relative distracting effect 
of each defective sound through a joint 
consideration of the relative frequency 
of the sound and the magnitude of the 
error (as defined by Wright) in its 
articulation. 

For some purposes adequacy of artic- 
ulation or seriousness of articulatory de- 
fect can be described meaningfully only 
in terms of listener reactions. The 
validity of some of the quantitative de- 
scriptions of articulation defects de- 
scribed above is therefore open to 
question. Only two of the investigators, 
Perrin and Barker (11, 2), have experi- 
mentally related their measures of artic- 
ulatory defectiveness to the reactions 
of listeners. Perrin’s results, however, 
are based on an analysis of speech samples
from only seven children, and although
Barker’s results are based on measures
obtained from the responses of 45 children,
there seems some likelihood
that her sample included few, if 
any, children misarticulating large num- 
bers of sounds. 

The usual articulation test yields a 
number of possible cues to listener re- 
action: the consistency with which 
speech sound-errors are made, the posi- 
tion (initial, medial, or final) in which 
the errors occur, the function of the 
sound in running speech (releasing or 
arresting), the phonetic category to 
which the misarticulated sounds belong 
(such as plosive or fricative), and the 
type of error with respect to whether 
the misarticulations involve sounds oc- 
curring as singles or in blends. Also to 
be considered is information associated 
with testing, such as age of the child. 
It seems reasonable to assume that each 
of these factors might relate important- 
ly to the severity of articulation defects 
when severity is defined in terms of 
the reactions of listeners. The relationships
between these factors and severity, 
however, have not been experimentally 
investigated. 

In relatively few reported studies 
have listener reactions to defective ar- 
ticulation been evaluated directly rather 
than from articulation test data. Among 
these are two of particular importance 
to the purposes of the present study. 
Morrison (10) obtained highly reliable 
scale values of articulation defectiveness 
based upon responses of listeners to the 
connected speech of 50 children with 
no important differences between re- 
sults for naive listeners and results for 
sophisticated listeners. Sherman and Cul- 
linan (13) found close agreement be- 
tween scale values based upon listener 
responses to one-minute speech samples 
and 10-second segments from the one- 
minute samples. These two studies pro- 
vide strong evidence that reliable 
quantification of listener reactions to 
defective articulation in very short 
speech samples can be accomplished and 
with the use of either naive or sophisti- 
cated listeners. It is also evident that 
scale values for very short samples are 
as useful for many experimental pur- 
poses as are scale values for longer 
samples. 


Problem 

The purpose of the present study 
was to evaluate relationships between 
measures of defectiveness of articulation 
obtained from articulation test responses 
and those obtained from listener ratings 
of short samples of connected speech. 


The measures obtained from analysis 
of the articulation test responses con- 
cerned number of speech sounds de- 
fective; frequency in the language of 
these sounds; phonetic consistency of 
the speech sound-errors; type of sound- 
error (omission, substitution, distor- 
tion); position and function of the 
consonant sounds misarticulated; pho- 
netic category of sounds misarticulated; 
sounds misarticulated in blends. Also 
considered was the age of each child. 


Procedure 

General Method. Primarily two types 
of data were obtained: (a) scale values 
of articulation defectiveness derived 
from responses of listeners to samples 
of continuous speech and (b) measures 
of various aspects of misarticulation de- 
rived from results of articulation test- 
ing. A multiple regression analysis was 
employed. The dependent variable is 
scaled severity of articulation defective- 
ness; the independent variables are 22 
measures obtained from articulation test 
responses and also the age of each sub- 
ject. 

Subjects were 150 children with mild 
to severe articulation deviations. They 
were selected from speech and hearing 
clinics of three universities and colleges 
and five public schools. No subject had 
experienced adolescent voice change or 
displayed any pronounced deviation of 
voice quality or rhythm, or any marked 
immaturity of language development. 
Tape-recorded speech samples were se- 
lected according to these criteria by 
unanimous agreement among three ex- 
perienced speech correctionists. 


Dependent Variable. Sampling of 
Connected Speech. A tape-recorded 30- 
second sample of the connected speech 
of each child was obtained while that 
child related a recent experience or a 
story. Tape speed was 15 inches per 
second. 

Construction of Experimental Tapes. 
The 150 30-second samples of con- 
tinuous speech were arranged in a 
random order and dubbed onto new 
tape. During the dubbing process, gross 
intensity differences between samples 
were eliminated. The 30-second sample 
length was chosen to ensure that the 
listeners had a sufficient sample of each 
child’s running speech on which to 
base judgments of severity (13). 


Severity Scaling. The recorded speech 
samples were rated by 36 advanced 
undergraduate and graduate students 
majoring in speech pathology. The scale 
was a nine-point, equal-appearing inter- 
vals scale which extended from one, 
representing least defective articulation, 
to nine, representing most defective ar- 
ticulation. The method has been dem- 
onstrated to be reliable and practicable 
for scaling articulation (10, 13, 14). 

Instrumentation. The instrumentation 
for the listening session consisted of an 
Ampex tape playback, Model 350, a 
McIntosh amplifier, Model 20W2, and 
a Jensen duax loudspeaker. Listening 
took place in a sound-treated room. 


Experimental Listening Session. Im- 
mediately before scaling the samples, 
the listeners were given a short training 
session. The instructions for scaling 
were read and questions answered. Fol- 
lowing this, a short training tape con- 
sisting of 11 30-second samples of the 
running speech of children was pre- 
sented. The first three samples had been 
selected as representative of the two 
extremes and the midvalue of articulatory
defectiveness from among the 
samples obtained by Morrison (10). 
These samples were presented to the 
listeners twice. The remaining five sam- 
ples, which had been selected from 
among Morrison’s 50 samples to repre- 
sent a range of articulatory defective- 
ness, were rated. The experimenter then 
announced the previously determined 
scale values for comparison. Scaling of 
the experimental samples required ap- 
proximately two and a half hours, in- 
cluding three five-minute rest periods. 


Reliability of Scaling. To evaluate reliability
of obtained scale values, 20 
samples were selected at random from 
among those scaled by Morrison and a 
30-second portion from each was in- 
serted randomly in the experimental 
tape. The 20 scale values derived from 
responses of listeners in the present 
experiment were correlated with cor- 
responding scale values already estab- 
lished. A Pearson r of .94 indicated 
satisfactory reliability. 


Independent Variables. Articulation 
Testing. On the same day that his con- 
nected speech was sampled, each child’s 
articulation was tested with the Temp- 
lin-Darley (19) 176-item diagnostic ar- 
ticulation test. This test elicits, by means 
of pictures and questions, the consonant 
sounds in all positions in which they 
appear as singles, most of the common 
blends, the single vowels, and the diph- 
thongs. The speech sound tested in each 
response was classified and recorded as 
being correct, distorted, substituted, or 
omitted. In the case of blends, each 
consonant element composing the blend 
was considered as being under test and 
an evaluation of the production of each 
element was recorded. There are 282 
presentations of sounds in the Templin-Darley
test when blends are considered 
as described above. 


Reliability of Articulation Testing. 
Five children were selected at random 
from the 150 children. The responses of 
each of these children to the articula- 
tion test pictures were tape recorded 
at the same time that his articulation was 
tested. Several months following the 
initial test the examiner tested the artic- 
ulation of each one of these children 
by playing back the tape-recorded re- 
sponses and evaluating the sounds being 
tested in each response; two weeks later 
he repeated the same procedure. Two 
other experienced clinicians also tested 
the articulation of the five children 
by playing back the tape-recorded re- 
sponses and evaluating them. A coeffi- 
cient of contingency (C) was computed 
for each of the comparisons, the ex- 
perimenter with himself and the experi- 
menter with each of the other two 
clinicians. The three C coefficients made 
possible an evaluation of the intraob- 
server as well as interobserver reliabili- 
ty. When using four categories (correct 
sound, distortion, substitution, and omis- 
sion), the maximum C coefficient ob- 
tainable is .866. The C obtained when 
the experimenter’s test results were 
compared with his retest results was 
.804. The C coefficients indicating the 
extent of agreement between the ex- 
perimenter’s test results and the results 
obtained by each of the other two 
testers were .729 and .756. 
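
The ceiling of .866 follows from the usual upper bound on the contingency coefficient for a square table with k categories, the square root of (k - 1)/k. A minimal Python sketch is given here as an illustration only: the function names and the 4 x 4 agreement table are hypothetical and are not the study's data.

```python
import math

def max_contingency(k):
    """Upper bound of C for a square k-by-k table: sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)

def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (chi2 + N)) for a table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + n))

# Hypothetical table: rows are the first tester's categories (correct,
# distortion, substitution, omission), columns are the second tester's.
# Perfect agreement puts every count on the diagonal.
perfect_agreement = [
    [10, 0, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 0, 10],
]

ceiling = max_contingency(4)
observed_c = contingency_coefficient(perfect_agreement)
```

With four categories the bound is about .866, and a table showing perfect agreement reaches it, so the obtained values of .804, .729, and .756 are best read against that ceiling rather than against 1.00.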


Number of Speech Sounds Defective. 
Three measures of number of defective 
sounds were obtained. Each measure or 
method of counting is in fairly common 
use and a comparison of the relative 
strengths of their relationships to 
severity would be useful in selecting 
one for clinical use. 

The first measure was a simple count 
of the number of items misarticulated 
out of the total of 176 items on the test. 
Each child’s score was the number of 
items he had misarticulated. Such an 
item count was used by Templin (17) 
in specifying cut-off scores which sepa- 
rate adequate from inadequate articula- 
tion at various age levels on a diag- 
nostic test of articulation. 

The second measure was a count of 
the number of defective sounds. In 
computing this measure, a sound was 
considered to be defective if it was 
misarticulated in any position as a single 
or in a blend. Consonant and vowel 
productions of /r/, /l/, and /m/ were 
considered as separate sounds.* Each 
child’s score was the number of dif- 
ferent sounds he had misarticulated. 
The total possible number of defective 
sounds counted in such a manner is 45. 
This count is probably most similar to 
those used clinically where fairly ex- 
tensive testing of blends is routine, and 
the assumption apparently underlying 
articulation testing is that speech sound- 
errors tend to vary with phonetic en- 
vironment and, therefore, tend to be 
inconsistent. 


*Both /r/ and /l/ were considered vowels 
if they acted as off-glides or were syllabic; 
/m/ was considered a vowel only if it was 
syllabic. No test item contained /n/ as a 
syllabic element. Separating the vowels and 
consonant sounds in this way relates to 
differences in the physiology of the produc- 
tion of these sounds in connected discourse. 
Although no experimental evidence is avail- 
able concerning /l/, /m/, and /n/, data 
obtained by Hardy (7) on children’s mis- 
articulations of the various /r/ sounds in- 
dicate that vowel and consonant /r/ sounds 
involve different kinds or degrees of articu- 
lation skills. 
In deriving the third measure, a 
sound was considered defective only 
when misarticulated in at least one posi- 
tion as a single. Sounds produced cor- 
rectly as singles but misarticulated in 
blends were not considered defective 
sounds. Consonant and vowel productions
of /r/, /l/, and /m/ were counted 
separately. This method of determin- 
ing number of defective sounds is 
reasonably representative of those used 
clinically where blends are not tested 
extensively and where the assumption 
underlying articulation testing seems to 
be that speech sound-errors tend to 
be consistent and fairly independent of 
phonetic environment. 
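
The three counts can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used: the `Response` record, the function names, and the sample responses are hypothetical, and treating a blend item as misarticulated when any of its elements is in error is an assumption the text does not spell out. Vowel and consonant forms of a sound would simply carry different labels.

```python
from collections import namedtuple

# Hypothetical record: one tested sound element within one test item.
Response = namedtuple("Response", "item sound in_blend correct")

def defective_item_count(responses):
    """Measure 1: test items with at least one misarticulated element."""
    return len({r.item for r in responses if not r.correct})

def defective_sound_count(responses):
    """Measure 2: sounds misarticulated anywhere, as a single or in a blend."""
    return len({r.sound for r in responses if not r.correct})

def defective_single_count(responses):
    """Measure 3: sounds misarticulated in at least one position as a single."""
    return len({r.sound for r in responses
                if not r.correct and not r.in_blend})

# Invented responses for one child (not data from the study).
sample = [
    Response(1, "s", False, False),  # /s/ as a single, misarticulated
    Response(2, "s", False, True),   # /s/ as a single, correct
    Response(3, "s", True, False),   # /s/ in a blend, misarticulated
    Response(3, "t", True, False),   # /t/ in the same blend, misarticulated
    Response(4, "t", False, True),   # /t/ as a single, correct
    Response(5, "k", False, True),   # /k/ as a single, correct
]

items = defective_item_count(sample)
sounds = defective_sound_count(sample)
singles = defective_single_count(sample)
```

In this invented case /t/ counts toward the second measure but not the third, because it is in error only within a blend.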


Frequency of Occurrence of Error- 
Sounds. Several investigators have de- 
rived rank orders of English speech 
sounds supposedly reflecting the fre- 
quency of occurrence of these sounds 
in the spoken language (4, 5, 20, 21). 
Travis (20) has reported an order of 
frequency of occurrence of consonant 
sounds in children’s speech. The investi- 
gations cited, with the exception of 
Travis, have concerned samples drawn 
from the speech or writings of adults. 
Rank difference correlation coefficients, 
based on frequency of occurrence rank- 
ings of sounds, however, for the com- 
parisons of Travis with Dewey (4), 
Travis with French, Carter, and Koenig 
(5), and Travis with Voelker (21) are 
high: .95, .94, and .96, respectively. On 
the basis of these comparisons it would 
seem that any of the rankings is reason- 
ably descriptive of children’s speech, at 
least to the extent that the Travis rank- 
ing is descriptive. 

The striking similarities among these 
speech-sound rankings, despite dissimilar
sampling and transcribing techniques,
appear to indicate that relative 
frequency of occurrence is a fairly 
stable measure, despite variations in age 
and vocabulary of the speaker and type 
of verbal material sampled. Selection of 
the ranking, therefore, was a fairly 
arbitrary choice. The Travis ranking 
and the French, Carter, and Koenig 
ranking were selected with the inten- 
tion of investigating whether one makes 
possible a measure which relates more 
closely with rated severity of articula- 
tion defect than does the other. 


The first weighting of misarticulated 
sounds by their relative frequency of 
occurrence utilized Travis’ ranking and, 
therefore, involved only the 25 con- 
sonant sounds. Consonant and vowel 
/r/, /l/, and /m/ were not treated 
separately but were considered as mem- 
bers of the same phoneme since they 
were so listed in Travis’ ranking. In 
obtaining this measure of frequency of 
occurrence, only those consonants mis- 
articulated as singles were considered 
defective. With this defective-sound 
criterion, it seemed likely that those 
sounds which were misarticulated only 
in particular phonetic environments, 
usually in particular blends, would not 
be counted as defective sounds. If so, 
this measure of frequency of occur- 
rence of error-sounds might relate more 
closely to severity than would a meas- 
ure including sounds only occasionally 
misarticulated. 


A weight was given each sound on 
the basis of its relative frequency of 
occurrence in the language as indicated 
by Travis. The /t/, for example, was 
given a weight of 12 because occur- 
rences of /t/ account for 12% of all 
consonant sound occurrences. The frequency
weights of all sounds misarticulated
were summed for each child without
regard to the number of positions 
in which the sound was misarticulated. 
A /t/ sound, for example, which was 
defective in three sound-positions was 
given the same frequency weight (12) 
as a /t/ sound which was defective in 
only one or two sound-positions. The 
total frequency weight was then di- 
vided by the number of defective 
sounds to yield the mean frequency 
weight for each child. 
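
The computation of the mean frequency weight can be sketched in Python. Only the weight of 12 for /t/ comes from the text; the other weights, the function name, and the list of defective sounds are hypothetical placeholders, and returning zero for a child with no defective sounds is an assumption made only to keep the sketch total.

```python
def mean_frequency_weight(defective_sounds, weights):
    """Sum the weights of the distinct defective sounds, then divide by
    the number of defective sounds."""
    # Each sound counts once, whatever the number of positions in error.
    sounds = set(defective_sounds)
    if not sounds:
        return 0.0  # assumption: no defective sounds gives a score of zero
    return sum(weights[s] for s in sounds) / len(sounds)

# /t/ = 12 is the example given in the text; /s/ and /k/ are placeholders.
weights = {"t": 12, "s": 7, "k": 5}

# /t/ is listed twice to stand for errors in two positions.
score = mean_frequency_weight(["t", "t", "s"], weights)
```

Here the total weight is 12 + 7 = 19 over two defective sounds, giving a mean of 9.5 regardless of how many positions of /t/ were in error.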


In order to obtain a frequency of 
occurrence ranking which included all 
of the sounds presented in the articula- 
tion test, French, Carter, and Koenig’s 
basic data were utilized but their 
speech-sound categories were discarded. 
Relative frequencies of occurrences 
were computed for all consonants, 
vowels, and diphthongs. Consonant and 
vowel productions of /r/, /l/, and /m/ 
were considered separate phonemes, and 
separate frequencies of occurrences 
were computed for each. Weights were 
assigned each sound on the basis of the 
number of times that sound occurred 
in the 80,000-word sample obtained by 
French, Carter, and Koenig. A sound 
was considered defective if it was mis- 
articulated on any of the articulation 
test items designed to test it. The fre- 
quency weights of all sounds misarticu- 
lated were summed for each child, and 
a mean was computed. 


Phonetic Consistency of Errors. Two 
measures of phonetic consistency of 
errors were obtained. The first measure 
utilized the defective-sound criterion 
described in connection with the second 
measure of number of defective sounds; 
that is, a sound misarticulated either as 
a single or in a blend was considered a 
defective sound. Each child’s phonetic-consistency-of-errors
score, with respect
to a particular defective sound, 
was derived by computing the propor- 
tion of tested occurrences of that sound 
which were defective. All such pro- 
portions obtained from the data in a 
single articulation test (one for each 
defective sound) were summed and a 
mean computed. This mean of propor- 
tions constituted the child’s phonetic- 
consistency-of-errors score. 
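
As a sketch of this mean of proportions, the following Python function takes, for each sound, one true/false entry per tested occurrence (true meaning misarticulated). The data layout, the function name, and the sample values are hypothetical; sounds never misarticulated are left out, since the score is defined over defective sounds only.

```python
def phonetic_consistency(tested):
    """tested maps each sound to a list of booleans, one per tested
    occurrence, with True meaning the occurrence was misarticulated."""
    proportions = [sum(errors) / len(errors)
                   for errors in tested.values()
                   if any(errors)]  # keep defective sounds only
    if not proportions:
        return 0.0  # assumption: no defective sounds gives a score of zero
    return sum(proportions) / len(proportions)

# Invented responses for one child (not data from the study).
child = {
    "s": [True, True, False, True],  # defective on 3 of 4 occurrences
    "r": [True, False],              # defective on 1 of 2 occurrences
    "k": [False, False, False],      # never defective, so excluded
}

score = phonetic_consistency(child)
```

The two defective sounds yield proportions of .75 and .50, so the score for this invented child is .625.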

The second measure of phonetic con- 
sistency, which was also a mean of pro- 
portions, was based on a count of those 
sounds misarticulated as singles; con- 
sonant and vowel /r/, /l/, and /m/ 
were considered separate sounds. Since 
diphthongs and vowels, with the ex- 
ception of /r/, /l/, and /m/, were 
tested only once each, measures of the 
phonetic consistency of misarticulation 
of these sounds were not possible. 

Type of Sound-Error. The propor- 
tions of the total sound-errors which 
were omissions, substitutions, and dis- 
tortions, respectively, were computed 
for each child. The three proportions 
constituted each child’s type-of-error 
scores. Sounds were considered defec- 
tive if they were misarticulated on any 
test item designed to test them. 
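
A short Python sketch of the type-of-error scores follows. The labels, the function name, and the sample ratings are hypothetical, and a child with no errors is given zeros as an assumption of the sketch.

```python
from collections import Counter

ERROR_TYPES = ("omission", "substitution", "distortion")

def type_of_error_scores(ratings):
    """ratings holds one label per tested sound presentation:
    'correct' or one of the three error types."""
    counts = Counter(r for r in ratings if r in ERROR_TYPES)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in ERROR_TYPES}
    return {t: counts[t] / total for t in ERROR_TYPES}

# Invented ratings for one child (not data from the study).
ratings = ["correct", "omission", "omission",
           "substitution", "distortion", "correct"]
scores = type_of_error_scores(ratings)
```

Because the three proportions are taken over the same total, they sum to one for any child with at least one error.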

Position of Consonant Sounds Mis- 
articulated. In obtaining measures for 
positions of consonant sounds misar- 
ticulated, only those consonant sounds 
misarticulated as singles were consid- 
ered defective. The proportions of mis- 
articulations which occurred in the 
initial, medial, and final positions were 
the position-of-consonant-errors scores. 

Function of Consonant Sounds Mis- 
articulated. Considering as defective 
only those consonants misarticulated as 
singles, the proportion of misarticulated 
test items which were releasing sounds 
and the proportion of misarticulated 
test items which were arresting sounds 
were the function-of-misarticulated- 
consonants scores. 


Phonetic Category of Sounds Mis- 
articulated. The measures of phonetic 
category of sounds misarticulated were 
the proportions of misarticulated sound- 
presentations consisting of (a) vowels 
or diphthongs, (b) nasals, (c) plosives, 
(d) onglides, (e) fricatives and (f) af- 
fricates. Any sound misarticulated on a 
test item designed to test it was con- 
sidered defective. 


Proportion of Defective Sounds Mis- 
articulated in Blends. A measure of the 
proportion of defective sounds mis- 
articulated in blends was derived, based 
on a total including any sound mis- 
articulated on a test item designed to 
test it. 


Results and Discussion 


Statistical Procedure. The major 
method of statistical treatment of the 
data was a multiple regression analysis 
(6, 22). This analysis had as its goal 
the selection of those independent vari- 
ables making important contributions 
to the multiple correlation and the 
elimination of those independent variables
which do not relate, singly or in 
company with other variables, to 
judged severity of running speech. The 
anticipated end-product of this analysis, 
then, was the identification of those 
relatively few variables which in com- 
pany with one another relate highly to 
judged severity. 


Because of the extremely large number,
24, of variables involved, the analysis
could be accomplished only through 
the use of an electronic computer.* The 
following statistics were computed: 
means, standard deviations, the intercorrelations
among all the variables 
measured, the inverse of the intercorrelation
matrix, a multiple coefficient of 
correlation (R), and the partial correlation
coefficients ($r_{0i \cdot jk \ldots n}$) between 
the dependent variable and each independent
variable (all remaining independent
variables held constant). 
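
The way R and the partial correlations follow from the inverse of the intercorrelation matrix can be sketched in Python. This is not the Cohen program mentioned in the footnote. It uses two standard identities, with the dependent variable in row and column 0 and P denoting the inverse matrix: R squared equals 1 - 1/P[0][0], and the partial r for variable i equals -P[0][i] divided by the square root of P[0][0] times P[i][i]. All function names and the three-variable example are hypothetical.

```python
import math

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("intercorrelation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def multiple_r_and_partials(corr):
    """corr: intercorrelation matrix, dependent variable in row/column 0."""
    inv = invert(corr)
    multiple_r = math.sqrt(max(0.0, 1.0 - 1.0 / inv[0][0]))
    partials = [-inv[0][i] / math.sqrt(inv[0][0] * inv[i][i])
                for i in range(1, len(corr))]
    return multiple_r, partials

# Hypothetical example: two uncorrelated predictors, each correlating .50
# with the dependent variable.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]
multiple_r, partials = multiple_r_and_partials(corr)
```

In this invented example R squared is .50, the sum of the two squared correlations, as expected for uncorrelated predictors, and each partial r is about .58. Re-running the same function on a matrix with some rows and columns removed corresponds to recomputing the statistics after variables are dropped.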

Variables which contributed impor- 
tantly to the multiple regression were 
selected and those which did not were 
discarded on the basis of the following: 

(a) Size of the partial r associated 
with each independent variable. 

(b) Careful study of each variable 
with a view to recognizing and elimi- 
nating those measuring the same thing 
or in some other way sharing a common 
contribution to the multiple regression. 
(For example, it is obvious that such an 
overlap or sharing exists between the 
two variables of misarticulated releas- 
ing items and misarticulated arresting 
items, since, as defined in this study, 
a consonant sound in a particular en- 
vironment must be either releasing or 
arresting. Either one of these measures, 
therefore, mirrors the contribution of 
the other and might be eliminated with- 
out any reduction in the size of the 
multiple R. Further, these two measures 
added together constitute a count of 
number of defective single consonants 
and thus probably share also in the 
contribution to the multiple regression 
made by the number of defective 
singles variable. The decision to eliminate
such variables can thus be relatively
safely made despite other evidence, 
for example, relatively large partial rs, 
indicative of an important contribution.) 


*An IBM Model 650 Digital Computer was 
used. The computer program used was one 
developed by Cohen (3) and modified 
slightly by Dr. Dee Norton of the University
of Iowa. 


(c) Consideration of the multiple Rs 
between severity and various promising 
combinations of independent variables. 
(Trial runs often helped identify vari- 
ables which could be eliminated without 
a reduction in R and, conversely, some 
variables were identified as strong con- 
tributors to the multiple regression be- 
cause of the relatively large decrease in 
R resulting from their elimination.) 

Seven variables were eliminated in the 
first stage of the analysis, mainly be- 
cause of considerations of the type 
described in (b) above. 

All further selection was done pri- 
marily on the basis of the size of the 
partial r between the independent vari- 
able under scrutiny and the dependent 
variable: those independent variables 
showing a relatively large partial cor- 
relation with the dependent variable, 
judged severity, were retained and those 
variables showing little correlation with 
severity were eliminated. 
Following each elimination of a few, 
never more than four, independent vari- 
ables, a new inverse matrix was con- 
structed using only the remaining 
variables. A new multiple correlation 
coefficient, new regression coefficients, 
and new partial correlation coefficients 
were computed. The new partial cor- 
relation coefficients were scrutinized 
and the variables showing the smallest 
partial rs were eliminated. This pro- 
cedure was continued until but three 
independent variables remained. 


As a check on the selection pro- 
cedure just described, several different 
promising combinations of the ap- 
propriate number of independent vari- 
ables were tried at several stages in the 
analysis. A multiple R was computed 
for each of these combinations of vari- 
ables and these Rs were compared to 
the R obtained using the combination 
of variables selected on the basis of the 
size of their partial rs. Without excep- 
tion the multiple R associated with the 
combination of variables chosen accord- 
ing to the described procedure was 
larger. 





Table 1. R and R² for each stage of selection of important variables; partial correlation 
coefficients associated with each of eight major independent variables at the various stages. 
The eight variables are: number of single defective sounds, age, consistency, proportion of 
omissions, proportion of arresting sounds, proportion of medial sounds, proportion of final 
sounds, and proportion of plosive sounds. 


Number of                                   Measures
Variables     Sing      Age      Con     Omis      Arr      Med    Final     Plos       R*       R²
   23          .09     -.18      .18      .12      .14     -.22     -.23      .03      .86      .74
   16          .18     -.20      .28      .29 [illeg.]     -.18     -.27 [illeg.]      .85      .72
   12          .19     -.20      .29      .29      .24     -.18     -.28      .20      .85      .72
    8          .41     -.16      .23      .33 [illeg.]     -.19     -.28      .19      .85      .72
    5          .38               .24      .38      .17              -.22               .83      .69
    3          .40               .17 [illeg.]                                          .82      .67
    1          .78†


*All multiple Rs significant beyond the 1% level of confidence. 
†This value is a simple Pearson r. 
Final Selection of Variables. In Table 
1 are summarized the seven major stages 
in the analysis. Not included are those 
variables eliminated in the first three 
stages: number of defective items; num- 
ber of defective sounds; frequency of 
occurrence, French-Carter-Koenig; fre- 
quency of occurrence, Travis; consist- 
ency, singles; substitutions; distortions; 
releasing; initial; vowels; nasals; on- 
glides; fricatives; affricates; and blends. 
The partial rs associated with these 
variables at the time of their elimination 
were: .01, .13, .15, -.05, -.01, .02, .10, 
[illegible], [illegible], .01, [illegible], .02, [illegible], [illegible], 
-.01, respectively, only one (the partial 
r associated with the initial sounds 
measure) significant at the 5% level. 
Considerations of the kind previously 
described indicated that the initial 
sounds variable actually made little, if 
any, unique contribution to the multi- 
ple regression and the inconsequential 
drop in R following its elimination con- 
firmed this view. 

As may be seen in the next to last col- 
umn of Table 1, the multiple R de- 
creased only .04, from .86 to .82, when 
20 of the 23 independent variables were 
eliminated. Perhaps the most meaningful 
way to evaluate this decrease is through 
the coefficient of determination (R²) 
which indicates the proportion of the 
variability in the dependent variable 
which can be accounted for or pre- 
dicted on the basis of variability in the 
measures of the independent variables. 
It should be kept in mind that the value 
of R², like the value of R, is relative to 
the circumstances under which it was 
obtained. Its primary usefulness here 
is as an indicator of the relative impor- 
tance of certain independent variables or 
combinations of independent variables 
(as they were obtained in this study) 
in the prediction of severity (as it was 
obtained in this study). As is indicated 
in Table 1, 67% of the variability in 
judged severity scores can be accounted 
for on the basis of variability in only 
three sets of scores: number of defec- 
tive singles, phonetic consistency, and 
omissions. When all 23 independent 
variables are considered, 74% of the 
variance in severity is accounted for, 
a gain of only 7% at the expense of in- 
cluding 20 more independent variables. 
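
The percentages quoted here follow from squaring the correlations in Table 1. The short Python sketch below repeats that arithmetic for three stages of the analysis; the dictionary layout and the helper name `percent_variance` are ours, chosen only for illustration.

```python
# Multiple R at selected stages of the analysis, as read from Table 1.
# Keys are the number of independent variables retained at that stage;
# the value for one variable is the simple Pearson r for defective singles.
multiple_r = {23: 0.86, 3: 0.82, 1: 0.78}

def percent_variance(r):
    """Coefficient of determination (R squared) as a whole percentage."""
    return round(100 * r ** 2)

variance = {k: percent_variance(r) for k, r in multiple_r.items()}

# Gain in explained variance from keeping all 23 variables instead of 3.
gain = variance[23] - variance[3]
print(variance, gain)
```

Squaring makes the small drop in R look somewhat larger in variance terms, which is why the comparison is stated in percentages of variance rather than in units of R.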

Defective Singles. Number of defec- 
tive singles emerged as that single vari- 
able most highly related to judged 
severity. Common variance between this 
variable and judged severity is 61%. 
Adding two more measures derived 
from articulation test responses in- 
creases the common variance by only 
6%; adding four more such meas- 
ures increases this by only 8%. If 
the reactions of the group of listeners 
used in this study may be taken 
as representative of the reactions of 
listeners in general (and on the basis 
of the experimental evidence reviewed 
earlier there seems little reason to ques- 
tion this assumption), number of 
defective singles appears to be the 
speech clinician’s best single measure 
with which to predict the severity of 
the child’s articulation problem as 
judged by his listeners. As an indirect 
quantifier of severity, as severity is de- 
fined in this study, number of defective 
singles is apparently superior to number 
of defective test items and to number 
of defective sounds. Statistical tests³
of the relationship between judged
severity and each of the three
above-mentioned articulation test
measures, however, indicate a
significant departure from linearity⁴
for the relationship between judged
severity and number of defective test
items. The index of correlation (η) for
evaluation of the relationship between
these two variables is .92. This
increase in the measure of relationship
may be accounted for by a curvilinear
association between these two variables
at the upper end of the regression
line. Equal increments in severity are
accompanied by increasingly greater
increments in number of defective test
items at the upper end of the scale.
For practical clinical purposes, either
measure, number of defective test items
or number of defective sounds, is a
useful indicator of the effect of the
child’s speech on the listener.

³F = msg/MSreg with df equal to number
of groups minus two and number of
subjects minus number of groups.
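
The test of linearity and the index of correlation can be illustrated on a small scale. The Python sketch below is a minimal illustration under stated assumptions: it uses the standard textbook partition of the between-groups sum of squares into a linear part and a deviation-from-linearity part, with the degrees of freedom given in the footnote (number of groups minus two, and number of subjects minus number of groups). The function name `linearity_test` and the eight data points are invented for illustration and are not data from the study. With 150 children, degrees of freedom of 83 and 65 would correspond to 85 distinct values of the articulation measure.

```python
from collections import defaultdict
from math import sqrt

def linearity_test(x, y):
    """Return (r, eta, F, df1, df2) for y regressed on a grouped x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    ss_total = sum((b - mean_y) ** 2 for b in y)

    # One group per distinct x value.
    groups = defaultdict(list)
    for a, b in zip(x, y):
        groups[a].append(b)
    k = len(groups)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(v) * (sum(v) / len(v) - mean_y) ** 2
                     for v in groups.values())
    ss_within = ss_total - ss_between
    ss_linear = sxy ** 2 / sxx            # part explained by a straight line
    ss_deviation = ss_between - ss_linear  # part left to curvature

    r = sxy / sqrt(sxx * ss_total)
    eta = sqrt(ss_between / ss_total)
    df1, df2 = k - 2, n - k
    f = (ss_deviation / df1) / (ss_within / df2)
    return r, eta, f, df1, df2

# Invented scores that bend upward at the high end of the scale.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1, 2, 2, 3, 5, 6, 10, 11]
r, eta, f, df1, df2 = linearity_test(x, y)
print(round(r, 2), round(eta, 2), f, df1, df2)
```

The index of correlation can never fall below the absolute value of the Pearson r, so the size of the gap between them is what the F ratio evaluates.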


Omissions. Proportion of misartic- 
ulations which were omissions is second 
in importance as a predictor of judged 
severity. It may be of interest to note 
here that the substitutions variable was 
among those measures eliminated at 
the first stage in the analysis and the 
distortions measure was eliminated at 
the third stage in the analysis. To the 
listeners these latter two types of 
sound-errors apparently were less con- 
spicuous than omissions. This result 
lends some support to the hypothesis 
that, in the ear of the listener, omissions
are worse errors than substitutions and 
distortions. With this in mind, and 
with all other considerations being 
equal, the speech clinician would do 
well to work first with those sounds 
which the child omits. 


⁴F = 3.82 (df = 83 and 65).
Phonetic Consistency. Phonetic
consistency of the misarticulated sounds is
the variable next in importance as a 
predictor of severity. As would be ex- 
pected, the more consistently the child 
misarticulates a sound or sounds, the 
more likely he is to be regarded as 
having a more severe problem. Again, 
a relatively straightforward clinical im- 
plication may be drawn from the find- 
ing; all other considerations being 
equal, the clinician would do well to 
give early attention to those sounds 
most consistently misarticulated. 


Contribution of Other Variables. 
Final Sounds. The partial r between 
severity and final sounds is negative. 
The clear implication of this result is 
that the listeners in this study were 
apparently somewhat less concerned or 
distracted by defective final sounds 
than by sounds misarticulated in other 
positions. The partial r associated with
the arresting sounds variable is slightly 
smaller than that associated with final 
sounds. 


Articulation testing of consonant 
sounds in initial, medial, and final posi- 
tions has been challenged by evidence 
in studies by Stetson (16) and others 
which demonstrates the strong prob- 
ability that such sounds in running 
speech do not conform to these cate- 
gories but are more adequately 
described as syllable-releasing or syl- 
lable-arresting elements. Results of the 
present study suggest some justification 
for testing final sounds inasmuch as 
the final-sounds variable appears to 
contribute to the multiple regression 
with severity no less than and possibly 
slightly more than the arresting-sounds 
variable. 
Age. The low, though significant,
negative partial correlation between 
severity and age indicates that, among 
children exhibiting equal numbers and 
kinds of misarticulations, younger chil- 
dren are somewhat more likely to be 
perceived by listeners as having more 
severe problems than older children. 
The reason for this is not apparent. 
Perhaps one explanation is that the 
judges were influenced by irrelevant 
factors such as extent of vocabulary, 
complexity of sentence structure, and 
subject matter. 


Over-All Considerations. Frequency 
and Degree. From an over-all exami- 
nation of the results presented in Table 
1, measures of two factors appear to 
be necessary for the best prediction of 
severity. The one factor involves fre- 
quency, the other degree. 

Such measures as number of defec- 
tive singles, phonetic consistency, and 
frequency of occurrence measure vari- 
ous aspects of the frequency of ex- 
hibition of articulatory differences. 
Combined, these measures express the 
frequency with which a particular
child’s misarticulations are presented to 
his listeners as he speaks. 

The grossness of the articulatory 
difference exhibited to the listeners was 
apparently an important factor in their 
judgment of severity. The omissions 
variable is the major representative of 
this factor of degree. 

Validity. Perhaps the most important 
result to emerge from this analysis 
was the finding that selected measures 
obtained from articulation test results 
were highly related to the reactions 
of listeners who audited the running 
speech of the child. In other words, it 
is apparent that results of articulation 
tests of the kind used in this study are 
accurately descriptive of the articu- 
latory behavior which children, show- 
ing no marked deviations of voice 
quality and rhythm, exhibit in running, 
communicative speech. Such tests may 
be considered, therefore, to have a 
high degree of validity when admin- 
istered to children who have articula- 
tion problems but who do not have 
accompanying voice or rhythm prob- 
lems. 


Intercorrelations. A Pearson r was 
computed for every possible pair of 
variables including the dependent vari- 
able. In addition to those relationships 
already discussed in connection with 
the multiple regression analysis, several 
others appear to be important on the 
basis of the obtained Pearson rs. 
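
With 23 independent variables and the severity measure, "every possible pair" amounts to 276 correlations. The Python sketch below shows how such a table can be built; the three short score lists are hypothetical values for four children, not data from the study, and the names `pearson_r` and `all_pairs` are ours.

```python
from itertools import combinations
from math import comb, sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def all_pairs(scores):
    """Map each unordered pair of variable names to its Pearson r."""
    return {(a, b): pearson_r(scores[a], scores[b])
            for a, b in combinations(scores, 2)}

# Hypothetical scores for four children; not data from the study.
scores = {
    "severity": [2.0, 4.5, 6.0, 8.0],
    "defective_singles": [3, 9, 14, 20],
    "age": [9, 8, 6, 5],
}
table = all_pairs(scores)

# 23 independent variables plus the dependent variable.
n_pairs_in_study = comb(24, 2)
print(n_pairs_in_study)
```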

Judged Severity. The single measures 
correlating most highly with judged 
severity are number of defective items, 
number of defective sounds, and num- 
ber of defective singles. The Pearson 
rs associated with these variables are
.72, .75, and .78, respectively. 

Both measures of frequency of oc- 
currence in the language correlate sig- 
nificantly and negatively with severity. 
The correlation with the
French-Carter-Koenig frequency weights is -.25;
that with the Travis weights is -.34.
These rs indicate that as severity de- 
creases, there tends to be an increase in 
mean frequency of occurrence in the 
language of the sounds which are mis- 
articulated. This seeming paradox re- 
lates to the fact that sounds which 
children ordinarily master last are 
sounds occurring quite frequently in 
the language (23). Thus children mis- 
articulating just a few sounds are likely 
to misarticulate sounds with relatively 
high frequency-of-occurrence weights.
When the effects of other variables 
were held constant, however, the par- 
tial correlation between severity and 
frequency of occurrence (French-Car- 
ter-Koenig weights) was a positive .15. 
This indicates that, when the effects of 
other variables, particularly differences 
in the number of sounds defective, 
were held constant, children misartic- 
ulating sounds with higher frequencies 
of occurrence in the language were 
somewhat more likely to be judged as 
having more severe problems than chil- 
dren misarticulating sounds associated 
with lower frequencies of occurrence. 
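
The reversal of sign can be reproduced in miniature with the first-order partial correlation formula. The Python sketch below holds only one variable constant, a count of defective sounds, whereas the partial r of .15 reported above holds all other variables constant, so the figure itself is not expected to be reproduced. Pairing the severity correlation of .75 with the range of -.37 to -.39 reported for the counts and frequency of occurrence is our assumption, as is the function name `partial_r`.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r_sev_freq = -0.25   # severity with French-Carter-Koenig weights
r_sev_count = 0.75   # severity with number of defective sounds

# The text gives -.37 to -.39 for the counts with frequency of occurrence;
# both ends are tried because the exact pairing is not stated.
partials = [partial_r(r_sev_freq, r_sev_count, r_count_freq)
            for r_count_freq in (-0.37, -0.39)]
print([round(p, 2) for p in partials])
```

Both results are small and positive, which is the direction of the change described in the text once number of defective sounds is taken into account.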


The three Pearson correlations (.70,
-.09, and -.43) between the severity
variable and each of the type-of-error 
variables (proportions of omissions, 
substitutions, and distortions) are of 
especial importance clinically. Pro- 
portion of omissions correlates strongly 
and positively with severity. A slight 
inverse relationship between severity 
and proportion of substitutions is sug- 
gested. There is a definite, moderate 
tendency for severity to decrease as 
the proportion of distortions increases. 
Apparently the listeners in this study 
reacted to omissions as the most severe 
form of misarticulation and to distor- 
tions as the least severe. These results 
provide some basis in experimental fact 
for ranking the three categories of 
misarticulations in terms of their lis- 
tener distraction value, from least to 
most distracting, in this order: distor- 
tions, substitutions, omissions. 


Among the measures of phonetic 
category of the error, trends of two 
are to be noted: first, the tendency for 
the proportion of misarticulated nasal 
sounds to increase as severity increases 
(r = .40); and, second, the corresponding
tendency for the relative proportion
of fricatives among the misarticulated
sounds to decrease as severity
increases (r = -.40). These trends
are perhaps best explained through ref- 
erence to corresponding trends in the 
relationships between the three counts 
of number of defective sounds and the 
measures concerned with nasals and 
fricatives. For example, the correlation 
between number of defective singles 
and nasal sounds is .40 and the correla- 
tion between number of defective 
singles and fricative sounds is -.43.
Those children misarticulating a greater 
number of sounds tended to include 
proportionately more nasal sounds 
among their sound-errors than those 
children who misarticulated relatively 
few sounds. It seems probable that chil- 
dren misarticulating relatively few 
sounds confine their errors, in general, 
to fricative sounds. It also seems prob- 
able that as misarticulations increase 
in number, sounds in other phonetic 
categories tend to be involved at the 
expense of the relative proportion of 
fricatives. Supporting this interpreta- 
tion is the fact that all phonetic cate- 
gories except fricatives relate positively 
to the number of defective-sound 
counts. Since number of defective 
sounds, however measured, relates so 
highly to severity, the trends described 
above are reflected also in the rela- 
tionships between severity and the 
measures of phonetic category of mis- 
articulations. 

The proportion of misarticulations 
occurring in the blends measure acts 
primarily as an inverse indication of 
number of defective singles and is, 
thus, always negatively related to
variables which are positively related to
number of defective singles and vice
versa. The negative association of
blends with severity (r = -.46) is
one example of this general observation. 

Measures of Number of Defective 
Sounds or Items. The three counts of 
number of misarticulations relate simi- 
larly to other variables and therefore 
are considered together. The Pearson 
rs indicating strengths of relationships 
between all pairs of these variables are 
.82 for number of defective items with 
number of defective sounds, .87 for 
number of defective items with num- 
ber of defective singles, and .90 for 
number of defective sounds with num- 
ber of defective singles. These rs re- 
veal the high degree of interdependence 
among these three variables. 

Moderate negative correlations be- 
tween the counts of defective sounds 
and frequency of occurrence in the 
language, ranging from an r of -.37
to an r of -.39, indicate again, as
previously mentioned in the discussion of
partial rs, the tendency for children 
misarticulating relatively few sounds 
to misarticulate sounds which occur 
relatively frequently in the language. 

Significant negative correlations be- 
tween counts of defective sounds and 
chronological age, ranging from an r 
of -.27 to an r of -.29, indicate that
as age increases, number of defective 
sounds or test items decreases. This 
finding is in complete agreement with 
previous findings concerning chrono- 
logical age and the number of sounds 
misarticulated (23). 

There is a very strong tendency for 
the phonetic consistency of misarticu- 
lations to increase as the number of 
sounds misarticulated increases; the 
correlations of defective items, of de- 
fective sounds, and of defective singles 
with consistency are .80, .51, and .69, 
respectively. This observation is in 
agreement with usual clinical findings 
and, since it expresses a functional re- 
lationship between the variables men- 
tioned, the general observation that 
children’s misarticulations are likely to 
be inconsistent (15) is made somewhat 
more useful. That is, the clinician can 
expect relatively more inconsistencies 
which are clinically useful in working 
with children who have relatively few 
defective sounds and vice versa. 

The relationships noted between 
severity and the three types of sound- 
errors exist to the same degree between 
the counts of defective sounds and the 
types of sound-errors (omissions, sub- 
stitutions, and distortions). This set of 
relationships may be expressed briefly 
as follows: in general, as the number 
of sounds the child misarticulates in- 
creases, the proportion of omissions 
among these sounds increases, the 
proportion of substitutions decreases 
slightly, and the proportion of distor- 
tions decreases more markedly. 

Among the trends for measures of 
phonetic category of sound misarticu- 
lated are significant tendencies for the 
relative numbers of nasals and onglides 
to increase and for the relative number 
of fricatives to decrease as number of 
defective sounds, however counted,
increases. For the reason noted in the
discussion under Severity, above, the
relative number of misarticulations in
blends decreases as the number of
misarticulations increases (r = -.70).

Frequency of Occurrence. The two 
measures concerned with the frequency 
of occurrence in the language, in
general, have the same relationship to other
variables. This would be expected since
they correlate highly with each other
(r = .90).

A moderate tendency for frequency 
of occurrence to be inversely related 
to phonetic consistency (r = -.41)
may be explained through reference 
to the relatively strong correlation be- 
tween phonetic consistency and num- 
ber of defective sounds cited above. 
These two related measures are both 
inversely related to frequency of oc- 
currence for the reasons presented in 
the discussion under Measures of Num- 
ber of Defective Sounds or Items, 
above. 

A moderate correlation between fre- 
quency of occurrence and vowels, .39, 
probably reflects the fact that various 
/r/ and /l/ sounds were considered
vowels for the purpose of this study 
and these sounds occur relatively fre- 
quently in the language. 

Chronological Age. Among the more 
important correlations with age is a 
negative association between chrono- 
logical age and phonetic consistency 
(r = -.25), as might be expected. This
indicates that the older children tended 
to be somewhat less consistent in their 
misarticulations than the younger chil- 
dren. 

Tendencies toward fewer omissions 
and more distortions as age increases 
are indicated by significant rs of -.25
between omissions and age and .43 be- 
tween distortions and age. 

Phonetic Consistency. The most rea- 
sonable measure of phonetic consist- 
ency of misarticulations, on a priori 
grounds, is the measure utilizing all of 
the articulation test items. It will be 
recalled that this measure contributed 
more to the multiple regression with
severity than did the measure which 
included just the consonants as singles. 
For these reasons only the former 
measure will be discussed here. 

As the Pearson rs indicate, those
children whose misarticulations are
relatively more consistent tended to
exhibit relatively more omission errors
(r = .38); those children with
relatively less consistent misarticulations
tended to exhibit relatively fewer
omission errors and relatively more
distortion errors (r = -.31).

A strong negative correlation be- 
tween phonetic consistency and the 
proportion of sounds misarticulated in 
blends, -.69, indicates that as
misarticulations became more consistent, a
relatively lower proportion of them took
place in blends. In other words, the
proportion of misarticulations involving
the sounds tested as singles
increased and the proportion of
misarticulations involving blends decreased.


Type of Sound-Error. The inter- 
correlations among omissions, substi- 
tutions, and distortions reflect their 
necessary interdependence. When one 
member of the three shows a strong 
positive association with another vari- 
able, at least one other member of the 
three must relate negatively since, for 
a given number of defective sounds, an 
increase in the proportion of one type 
of sound-error must be accompanied 
by a decrease in the proportion of at 
least one of the other two types of 
error. For example, as the proportion 
of misarticulated nasal sounds increases, 
the proportion of omissions increases 
and the proportion of distortions
decreases. The relationships between
nasals and the types of sound-errors
cannot be considered as indicative of a
cause and effect association among
them. The correlations among these
variables are, in all probability, the
secondary result stemming from their
significant correlations with another
variable, severity. A similar situation
exists for the correlations between 
omissions and fricatives and between 
substitutions and fricatives. Here, 
however, the proportion of
fricative sounds misarticulated decreases
as the proportion of omission errors 
increases and the relative number of 
misarticulated fricatives increases as 
substitutions increase. Again, this set 
of relationships is largely dependent 
upon correlation of these variables with 
severity. 
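
The necessary interdependence among the three type-of-error measures follows from their definition as proportions of the same total. The Python sketch below makes the constraint concrete; the error counts for the two children are hypothetical, and the function name `error_proportions` is ours.

```python
def error_proportions(omissions, substitutions, distortions):
    """Proportion of a child's misarticulations falling in each error type."""
    total = omissions + substitutions + distortions
    return tuple(count / total
                 for count in (omissions, substitutions, distortions))

# Hypothetical counts for two children with the same number of errors.
child_a = error_proportions(2, 5, 3)
child_b = error_proportions(6, 3, 1)
print(child_a, child_b)
```

Because the three proportions always sum to one, the larger share of omissions for the second child can only appear together with a smaller share of substitutions, of distortions, or of both.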


Summary and Conclusions 


The basic problem was to evaluate, 
by means of a multiple regression 
analysis, relationships between 22 meas- 
ures obtained from phonetic analysis 
of 150 children’s articulation test re- 
sponses and measures of defectiveness 
of articulation obtained from listener 
ratings of their connected speech. 
Tape-recorded 30-second speech sam- 
ples were rated on a nine-point equal- 
appearing intervals scale by 36 listeners. 

Results indicated the following con- 
clusions: 

(a) Articulation test responses, un- 
der the conditions of this experiment, 
provide valid information on articula- 
tory behavior in connected speech. 

(b) Reactions of listeners to artic- 
ulation defectiveness are primarily de- 
pendent upon two factors: frequency 
with which articulatory deviations 
occur and degree of articulatory devi- 
ations. 


(c) To the listener, omissions are 
more deviant than substitutions and 
substitutions are more deviant than 
distortions. 

(d) Articulation test measures of 
number of defective items and num- 
ber of defective single sounds are both 
highly related to measures of defective- 
ness of articulation derived from lis- 
tener responses to connected speech. 
RESEARCH NEWS NOTE

Use of Bobath Principles
in Cerebral Palsy Habilitation


Included in the August, 1959, issue of
the Journal of Speech and Hearing
Disorders was an article entitled,
‘Significance of Neurophysiological
Orientation to Cerebral Palsy
Habilitation,’ which contained a
footnote stating that a pilot study,
based on the utilization of the basic
principles of the Bobath approach to
cerebral palsy treatment, was in
progress at the Newington Hospital for
Crippled Children. This is to indicate
that the exploratory investigation has
been completed.

The study had three general purposes:
(a) to determine whether the application
of particular neurophysiological
techniques could alter the reflexology
of a group of children with cerebral
palsy; (b) to acquaint concerned
Newington staff with the therapeutic
rationale and procedure; (c) to serve
as a training opportunity for interested
therapists.


Pre- and post-therapy views of a series 
of reflexes and general motor development 
were organized into a pilot study film and 
evaluated. These picture records indicated 
that the reflexology of all children studied 
was positively influenced to varying degrees, 
and, therefore, it was felt that the approach 
warrants attention as well as further re- 
search. The initiation of a full-scale con- 
trolled study will depend on success in 
soliciting the necessary funds for such a 
large undertaking. 

The above described pilot study film, as 
well as an initial film concerned with pre- 
senting theory and therapy rationale, are 
available on a rental basis. 


Edward D. Mysak, Ph.D. 

Project Director 

Newington Hospital for Crippled 
Children 

Newington, Connecticut 
Visual Word Recognition
by Deaf and Hearing Children 


DONALD G. DOEHRING 


JOSEPH ROSENSTEIN 


Until about the age of six, hearing 
children develop verbal language skills 
almost exclusively by means of the audi- 
tory system. Deaf children usually do 
not understand spoken language until 
they have been given intensive training 
in speechreading, which is customarily 
initiated at an age when most hearing 
children have become fluent in the use 
of spoken language. Consequently, the 
deaf child must acquire written as well 
as oral language without benefit of the 
wealth of previous auditory verbal ex- 
perience that is available to the hearing 
child. Studies of visual recognition (2) 
have demonstrated that visual recogni- 
tion thresholds for words vary as a 
function of relative familiarity with 
the stimulus words. Furthermore, as 
discussed by Howes (3), some writers 
have stated that the recognition thresh- 
old of a word varies as a function of 
the number of times that an individual 
has previously responded to either visual
or auditory presentation of the word. If 
such is the case, deaf children should 
tend to be much less accurate in visual 
word recognition than hearing chil- 
dren, since they lack the previous 
experience with the auditory verbal 
stimulation that constitutes a very 
large proportion of the total verbal 
stimulation of hearing children. 


The present study represented an 
attempt to specify the effect of retarda- 
tion in the development of spoken lan- 
guage on the ability of deaf children to 
recognize visually-presented verbal 
stimuli. The performance of deaf
children was compared with that of hearing
children in the accuracy of visual
recognition of briefly-exposed letters,
trigrams (combinations of three letters 
that do not constitute words), and 
four-letter words. Single letters and 
trigrams were included as stimulus ma- 
terial in order to determine whether 
any difficulty in the recognition of 
words by deaf children might be re- 
lated to deficiencies in the visual per- 
ception of single symbols or of non- 
meaningful combinations of symbols. 
Each group of children was subdivided 
into two age groups in order to deter- 
mine whether any differences in recog- 
nition ability between deaf and hearing 
children might tend to disappear when 
deaf children have had more
opportunity to acquire certain requisite
language skills. A measure of reading
vocabulary was obtained for each child,
since a child’s ability to recognize ta- 
chistoscopically-presented words might 
vary as a function of the size of his 
reading vocabulary; and it seemed likely 
that the deaf children would differ con- 
siderably from the hearing children 
with respect to reading vocabulary. 


Procedure 


Subjects. Subjects were 40 orally- 
trained deaf children and 40 hearing 
children. Half of the children in each 
group ranged in age from 9 to 11 years 
and the other half ranged in age from 
12 to 16 years, with mean ages of 10.3 
for the young deaf group, 10.3 for the 
young hearing group, 13.9 for the older 
deaf group, and 13.8 for the older hear- 
ing group. The deaf children could not 
be matched with the hearing children 
according to IQ since the IQ scores 
available for the hearing children were 
based on the California Test of Men- 
tal Maturity (6), a group test of verbal 
and nonverbal ability, and the IQ scores 
of the deaf group were based on the 
Advanced Performance Scale (4), an 
individual test of nonverbal ability. 
Both the deaf and the hearing groups 
were, however, selected to be repre- 
sentative in terms of the distribution of 
IQ scores at their respective schools. 
The mean IQ was 117 for the young 
deaf group (SD = 10.34), 116 for the 
young hearing group (SD = 9.64), 117 
for the older deaf group (SD = 13.85), 
and 120 for the older hearing group 
(SD = 14.07). The deaf children were 
from Central Institute for the Deaf and 
the hearing children were from public 
schools in University City, Missouri. 
Impairment of hearing had occurred at 
or before birth for 21 of the deaf chil- 
dren, before age one for eight of the 
children, and before age 2.5 for the 
remaining 11 children. Children with 
corrected visual defects were required 
to wear their glasses during the test. 
Children with possible uncorrected 
visual defects were eliminated from the 
study. A child’s ability to recognize the 
three practice letters provided an addi- 
tional check on the type of visual 
acuity necessary for performance of 
the experimental task. 

Thresholds for visual word recog- 
nition are usually determined by the 
ascending method of limits (2), but 
this method is extremely time consum- 
ing, and the hearing children were 
made available for only a limited period 
of time. In the procedure adopted for 
this study each test stimulus was pre- 
sented only once, and the measure of 
accuracy of recognition was percentage 
of correct responses for each type of 
visual stimulus. 
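
The accuracy measure is a simple percentage for each type of material. The Python sketch below scores one hypothetical answer sheet; the five trigrams are taken from the stimulus list, the written responses are invented, and the rule that only an exact reproduction counts as correct is our assumption, since the scoring of partly correct responses is not described.

```python
def percent_correct(presented, responses):
    """Percentage of stimuli reproduced exactly, one presentation each."""
    hits = sum(1 for stimulus, response in zip(presented, responses)
               if stimulus == response.strip().upper())
    return 100.0 * hits / len(presented)

# Trigrams come from the stimulus list; the responses are invented.
trigrams = ["BEC", "DEP", "TIV", "GES", "FAC"]
written = ["BEC", "DEB", "TIV", "GES", "PAC"]
score = percent_correct(trigrams, written)
print(score)
```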


Apparatus. Stimuli were projected 
by means of a 35-mm slide projector 
on a small ground glass screen located 
at one end of a darkened viewing box. 
Duration of exposure of the stimuli 
was kept constant at .01 sec by the 
use of a shutter, and background 
illumination of the stimuli was held 
constant at an extremely low level by 
the use of a polaroid filter. The sub- 
ject looked into the viewing box 
through an opening 14 in. in front 
of the screen on which the stimuli 
were projected. The use of a hood on 
the viewing box obviated the necessity 
for exact control of ambient light
within the testing room. Stimuli were
lettered on tracing paper and mounted in
cardboard slide holders. All of the let- 
ters were capitalized, and the height of 
the projected letters was approximately 
0.12 in. 


Stimulus Material. Stimulus material 
for the test consisted of 10 single letters, 
20 trigrams (10 pronounceable, 10 un- 
pronounceable), and 30 four-letter 
words (10 each of high, middle, and 
low frequency). Three stimuli for each 
of the three types of material were in- 
cluded for practice. Following are the 
letters and trigrams used, their fre- 
quency (5, pp. 252; 264-278) per 1000 
and 20000 words, respectively, given 
in parentheses: 


T (473), S (275), D (171), F (132),
C (124), G (90), P (89), B (65), V (41),
Z (3), BEC (15), DEP (10), TIV (30), 
GES (13), FAC (21), VEP (0), ZIF (9), 
SEB (0), GOK (0), TUZ (0), GHT 
(41), DST (10), MPT (10), RCH (14), 
STR (57), BRV (0), ZFN (0), GPL (0), 
CMD (0), and FBX (0). 


The stimulus words were as follows, 
their frequency (7) per million words 
given in parentheses: 


SALT (100+), FISH (100+), LAND
(100+), MILK (100+), BIRD (100+),
DOWN (100+), SUCH (100+), LONG
(100+), MOST (100+), NEXT (100+),
BATH (46), BELT (48), DESK (50 to
100), SAND (50 to 100), CORN (50 to
100), LACK (50 to 100), PINK (50 to
100), TERM (50 to 100), DAWN (50 to
100), VAST (50 to 100), RAFT (7),
CURB (14), FERN (13), ZINC (10),
GORK (11), LURK (15), CULT (5),
GARB (7), REND (8), and MESH (6).


The words were taken from the Thorn- 
dike-Lorge (7) general count. Addi- 
tional criteria were that every word 
must be a single syllable, consonant- 
vowel-consonant-consonant combina- 
tion, that there should be no repeated 
letters within words, and that there 
should be no emotionally toned words 
or words with special connotations for 
either group of children. 
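
The formal selection criteria can be checked mechanically against the word list. The Python sketch below tests only what can be read from spelling: four letters, a consonant-vowel-consonant-consonant letter pattern, and no repeated letter. Syllable count and emotional tone are not tested, the function name `meets_criteria` is ours, and the words are entered as printed in the list above.

```python
VOWELS = set("AEIOU")

def meets_criteria(word):
    """Four letters, consonant-vowel-consonant-consonant, no repeated letter."""
    pattern = "".join("V" if ch in VOWELS else "C" for ch in word)
    return len(word) == 4 and pattern == "CVCC" and len(set(word)) == 4

words = ("SALT FISH LAND MILK BIRD DOWN SUCH LONG MOST NEXT "
         "BATH BELT DESK SAND CORN LACK PINK TERM DAWN VAST "
         "RAFT CURB FERN ZINC GORK LURK CULT GARB REND MESH").split()

# Words that break any of the three spelling rules.
failures = [w for w in words if not meets_criteria(w)]
print(len(words), failures)
```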


Testing. Each child was tested indi- 
vidually in his own school in a room 
that was comparatively free from dis- 
traction. The child was seated in front 
of the viewing box, given specific in- 
structions concerning the proper posi- 
tion for looking into the box, and told 
that he would first be presented with 
a series of letters of the alphabet. He 
was given an answer sheet and a pencil, 
and was asked to write down the letter 
that appeared on the screen. It was 
emphasized that he should guess when- 
ever he was not sure of the letter. The 
three practice items preceded the test 
items for each of the three types of 
material, first the letters, then the 
trigrams, then the words. Each prac- 
tice item was presented until the child 
made a correct response; each time he 
was urged to guess if he was not sure. 
The trigrams were described as groups 
of three letters that did not make a
word, and the words were described as 
words containing four letters. 

The interval between trials was de- 
termined by the speed with which the 
child wrote his response. The experi- 
menter did not present a new stimulus 
until the subject had finished writing 
and had leaned forward once again to 
look into the viewing box. For the hear- 
ing children the experimenter said 
‘Ready’ or gave the number of the trial 
just before presenting each stimulus. 
For the deaf children the experimenter 
made certain that the child’s attention 
was properly directed by pointing to 
the opening in the viewing box just be- 
fore presenting each stimulus. Stimuli of 
each type were presented in random 
order, with the same random order used 
for all subjects. 

Figure 1. Percent of correct responses made 
by deaf and hearing children on the reading 
vocabulary test and in the tests of visual 
recognition of letters, trigrams, and words. 

The Ammons Full-Range Picture 
Vocabulary Test (FRPV) (1) was 
individually administered to each sub- 
ject immediately after completion of 
the visual recognition test. The recom- 
mended test procedure was modified 
by presentation of the test words for 
Form A of the FRPV Test in type- 
written form on 3” x 5” cards rather 
than orally. This procedure was 
adopted because a reading vocabulary 
score seemed more relevant for the 
present study than a listening vocabu- 
lary score, and because the necessity of 
speechreading would have placed the 
deaf children at a considerable disad- 
vantage if the words had been pre- 
sented orally. About 30 minutes was 
required for presentation of both the 
recognition test and the FRPV Test. 


Results 


Figure 1 shows the percentage of cor- 
rect responses made by each group on 
the vocabulary test and on the three 
parts of the recognition test. With re- 
spect to the size of their reading vo- 
cabulary, the deaf children and hearing 
children represented two distinctly 
separate populations. The range of total 
correct responses on the FRPV Test 
was 13 to 31 for the young (CA 8 to 
11) deaf children, 35 to 57 for the 
young hearing children, 24 to 52 for 
the older (CA 12 to 16) deaf children, 
and (with the exception of a score of 
31) 56 to 81 for the older hearing 
children. 

The hearing children also tended to 
be more accurate than the deaf children 
in letter, trigram, and word recognition, 
as shown in Figure 1. However, the 
difference between the older groups 
was relatively small. The older deaf 
children, compared with the older hear- 
ing children, were 10% less accurate 
in recognition of the 10 single letters, 
10% less accurate in recognition of the 
20 trigrams, and only 4% less accurate 
in recognition of the 30 four-letter 
words. The young deaf children com- 
pared with the young hearing children 
were, on the other hand, 15% less accu- 
rate in letter recognition, 32% less ac- 
curate in trigram recognition, and 31% 
less accurate in word recognition. These 
results were not analyzed by analysis 
of variance because of a marked skew- 
ness in several of the distributions that 
resulted from a relatively large number 
of perfect or near-perfect perform- 
ances. Fewer than two errors in letter 
recognition were made by seven of the 
young deaf children, 12 of the young 
hearing children, 14 of the older deaf 
children, and 18 of the older hearing 
children; fewer than two errors in tri- 
gram recognition were made by one 
young deaf child, one young hearing 
child, six of the older deaf children, and 
seven of the older hearing children; and 
fewer than two errors in word recog- 
nition were made by one young deaf 
child, four of the young hearing chil- 
dren, nine of the older deaf children, 
and 11 of the older hearing children. 
Consequently, a nonparametric test, the 
Mann-Whitney Test (8, p. 434), was 
used for statistical analysis of perform- 
ance on the recognition tests. With the 
results thus analyzed, the young deaf 
children were significantly less accu- 
rate than the young hearing children 
in letter recognition (p = .038), tri- 
gram recognition (p = .0002), and 
word recognition (p < .0001). The dif- 
ferences in accuracy of recognition be- 
tween the older deaf group and the 
older hearing group were not statisti- 
cally significant for letter recognition 
(p = .072), trigram recognition (p = 
.267), or word recognition (p = .284). 
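
As an illustration only, the kind of comparison described above can be sketched in Python. The function below is a generic Mann-Whitney U test using the normal approximation with a correction for ties; the two-sided p value, the absence of a continuity correction, and the scores themselves are assumptions made for this sketch and are not taken from the study or from reference (8).

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p) via the normal approximation.

    Tied scores receive midranks, and the variance of U is corrected for ties.
    No continuity correction is applied (an assumption of this sketch).
    """
    pooled = sorted(list(x) + list(y))
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Assign midranks: tied values share the mean of the ranks they occupy.
    ranks = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                            # size of this tie group
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical recognition scores (number correct out of 20), not study data.
# The cluster of perfect scores mimics the skewness described in the text.
young_deaf = [10, 12, 14, 15, 18]
young_hearing = [17, 19, 20, 20, 20]

u, p = mann_whitney_u(young_deaf, young_hearing)
print(f"U = {u:.1f}, two-sided p = {p:.3f}")
```

Because the test uses only the rank order of the scores, a pile-up of perfect or near-perfect performances affects it only through the tie correction, which is the reason a rank test suits distributions of this kind.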


The list of trigrams had been selected 
to include both pronounceable (CVC) 
and unpronounceable (CCC) syllables 
in order to determine whether the 
greater experience of the hearing chil- 
dren in oral pronunciation would pro- 
vide them with a special advantage in 
performance of the experimental task. 
Since the deaf children made 10% 
fewer errors and the hearing children 
only 4% fewer errors on the pro- 
nounceable trigrams than on the un- 
pronounceable trigrams, it appeared 
that, if anything, the deaf children were 
aided more by pronounceability in the 
list of trigrams than were the hearing 
children. 


The expected relationship between 
frequency of word usage in written 
language and accuracy of word recog- 
nition was found in the young deaf 
group, with correct recognition by the 
young deaf children of 72% of the 
high frequency words, 55% of the 
middle frequency words, and 31% of 
the low frequency words. In the re- 
maining groups this relationship was at- 
tenuated by the generally high per- 
centage of accuracy in word recog- 
nition. Over 90% of the middle fre- 
quency and high frequency words were 
recognized correctly by all three of 
these groups, while the accuracy of rec- 
ognition of the low frequency words 
was 70% for the young hearing group, 
80% for the older deaf group, and 84% 
for the older hearing group. 


Discussion 


The level of performance of the 
young deaf children on the reading 
vocabulary test and on the three tests 
of visual recognition was clearly be- 
low that of the young hearing chil- 
dren. Although the older deaf children 
were as retarded in reading vocabulary 
relative to the older hearing children 
as were the young deaf children relative 
to the young hearing children, the 
performance of the older deaf children 
did not differ significantly from that 
of the older hearing children on the 
three tests of visual recognition. Since 
the majority of the children in both 
of the older groups made no errors in 
letter recognition, no conclusions can 
be drawn concerning the relative ability 
of the two groups in letter recognition. 
However, the distributions of scores 
for trigram recognition and word rec- 
ognition were not sufficiently skewed 
to prevent the conclusion that the older 
deaf group performed at about the 
same level of accuracy as the older 
hearing group in trigram recognition 
and word recognition. 

The finding that the older deaf chil- 
dren, despite an extreme retardation in 
reading vocabulary, were able to recog- 
nize briefly exposed words at essentially 
the same level of accuracy as their hear- 
ing peers provides support for Howes’ 
(3) contention that the accuracy of 
recognition of visually-presented words 
is dependent upon a subject’s estimate 
of the probability of occurrence of a 
word at the time of presentation rather 
than upon the sheer frequency of the 
subject’s previous responses. Such a 
probability estimate would consist of 
the subject’s guess as to the type of 
stimulus material that he would most 
likely be presented with, and the sub- 
ject’s response would be an estimate 
that conformed with the perceived 
structural characteristics of the stimu- 
lus. Apparently the older deaf children 
made about the same probability esti- 
mates as the older hearing children, 
since their accuracy of word recog- 
nition would have been much lower 
than that of the older hearing children 
if sheer frequency of previous responses 
to the stimulus words were the de- 
termining factor in word recognition. 
Sheer frequency of previous responses 
to words might, however, have a more 
important influence on accuracy of 
recognition at an earlier stage in lan- 
guage acquisition, before overlearning 
is achieved. The results of this study 
suggest that both of the groups of hear- 
ing children and the older deaf chil- 
dren had approached a stage of over- 
learning with respect to the type of 
stimulus material used in this experi- 
ment. The results further suggest that 
when overlearning occurs, size of read- 
ing vocabulary does not operate as an 
important determinant of accuracy of 
word recognition. The young deaf 
children’s lower accuracy of recogni- 
tion of all three types of verbal material 
could be explained by the assumption 
that as a result of their initial retar- 
dation in spoken language, they had not 
yet reached the stage of overlearning 
for the type of stimulus material pre- 
sented in this study. It is, however, in- 
teresting to note that the relation be- 
tween accuracy of word recognition 
and word frequency that has been ob- 
served repeatedly in adults was also 
observed in the group of young deaf 
children, who were presumably in a 
relatively early stage of language ac- 
quisition. 

Further investigation would be neces- 
sary for a more exact specification of 
those visual-verbal abilities in which the 
deaf child is able to achieve a normal 
level of performance in contrast to 
those abilities in which he continues to 
perform well below the level of his 
hearing peers even after a number of 
years of formal instruction. Such in- 
formation would aid in the planning of 
education programs for deaf children, 
and also would provide some insight 
into the general question of how verbal 
skills are acquired. 


Summary 


A test for accuracy of visual recogni- 
tion of briefly-exposed letters, trigrams, 
and four-letter words was administered 
to groups of 40 deaf children and 40 
hearing children ranging in age from 
eight through 16 years. All subjects also 
were given the Ammons Full-Range 
Picture Vocabulary Test (FRPV), 
with the stimulus words presented 
in typewritten form. The young (CA 
8 to 11) hearing children were signifi- 
cantly more accurate in letter, trigram, 
and word recognition than the young 
deaf children, but the older (CA 12 to 
16) deaf children did not differ signif- 
icantly from the older hearing children 
with respect to letter, trigram, or word 
recognition. However, the FRPV read- 
ing scores of both the young and the 
older deaf children were significantly 
lower than those of their hearing 
peers. It was concluded that accuracy 
of visual recognition of verbal material 
by the older deaf children was depend- 
ent upon an estimate of the probability 
of occurrence of the verbal stimulus 
rather than upon the mere frequency 
of prior visual and auditory stimulation. 


Hearing Levels and Types 


of Hearing Loss 


among Selected Air Force Personnel 


LENNART L. KOPRA 


In the evaluation of hearing loss among 
noise-exposed individuals, medical per- 
sonnel are faced with many problems in 
determining if a causal relationship 
exists between job noise and hearing 
loss. Basic questions which confront the 
interpreter of audiometric results 
include: (a) How do the test results 
compare with an accepted standard of 
normal threshold sensitivity? (b) What 
is the type of hearing loss, and is this 
type of loss related to a given job- 
noise environment? 

The present investigation had three 
purposes: (a) to compare the hearing 
acuity of men who work in noise with 
those who do not; (b) to compare the 
auditory thresholds of these men with 
normative data from other studies; (c) 
to determine the incidence of conduc- 
tive, perceptive, and mixed-type hear- 
ing loss within this population. 


Procedure 


Subjects. The 125 male subjects for 
this study were Air Force personnel 
stationed at Bergstrom Air Force Base. 
They were grouped according to on- 
the-job noise exposure and classification 
of their hearing and are referred to by 
their hearing classifications, as follows: 

Class A, with no hearing loss greater 
than 15 db from 500 through 6000 cps, 
25 non-noise-exposed individuals (to be 
referred to as Class Ann) and 25 job- 
noise-exposed individuals; 

Class B, with hearing loss of more 
than 15 db in either ear at any fre- 
quency from 500 through 6000 cps but 
averaging not more than 15 db for the 
three frequencies of 500, 1000, and 2000 
cps, 50 job-noise-exposed individuals; 

Class C, with average hearing loss in 
either ear of more than 15 db at 500, 
1000, and 2000 cps, 25 job-noise-ex- 
posed individuals. 

Median age ranged from 20.2 years 
for the Class A group to 24.4 years for 
the Class C group. 

A fourth hearing classification is used 
in this study but only to specify a hear- 
ing level; it does not refer to a subject 
group as do the three classifications 
above. It is known as Class CAF and is 
defined as an average hearing loss in 
either ear of 20 db or more for the 
three speech frequencies of 500, 1000, 
and 2000 cps. 

According to Air Force Regulation 
160-3, dated 29 October 1956, the hear- 
ing status of an individual is identified 
by his worse ear. An individual with 
a Class C hearing status (classification 
based on his worse ear) may have an 
opposite ear which meets the criteria 
for Class A, Class B, Class C, or Class 
CAF. For this reason, ear classification 
as well as individual status is considered 
in the present study. For this purpose, 
ears were classified as Type A, Type B, 
Type C, and Type CAF, using the 
same criteria as for classification of in- 
dividuals in Class A, Class B, Class C, 
and Class CAF, respectively. 


As used in this study, a Type C ear is 
one in which the average loss for 500 to 
2000 cps is more than 15 db as con- 
trasted to Class C hearing (called Class 
CAF in the present study) as defined 
in Air Force Regulation 160-3 where 
Class C hearing is identified as average 
worse-ear hearing of 20 db or more. 
Since the 500 to 2000 cps pure-tone 
average is a good estimate of the hear- 
ing loss for speech and because it is 
generally accepted that hearing losses 
in excess of 15 db appear to be signifi- 
cant in terms of hearing adequately in 
social situations, an average of more 
than 15 db appeared more desirable 
than the 20-db cut-off point for Class 
C hearing. For this reason, in this report 
Class C hearing is based upon a 500 to 
2000 cps average of more than 15 db. 
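
The classification rules above can be restated as a short Python sketch. The function and variable names are hypothetical, and the list of frequencies checked for Type A and Type B is assumed to be the study's test frequencies from 500 through 6000 cps; the thresholds of 15 db and 20 db are those given in the text.

```python
from statistics import mean

# Frequencies (cps) entering the speech-frequency average, as stated in the text.
SPEECH_FREQS = (500, 1000, 2000)
# Assumed set of frequencies checked "from 500 through 6000 cps".
CHECKED_FREQS = (500, 1000, 1500, 2000, 3000, 4000, 6000)


def speech_average(levels):
    """Average hearing level (db) at 500, 1000, and 2000 cps."""
    return mean(levels[f] for f in SPEECH_FREQS)


def ear_type(levels):
    """Classify one ear as Type A, B, or C using this study's criteria."""
    if speech_average(levels) > 15:
        return "C"   # speech-frequency average of more than 15 db
    if any(levels[f] > 15 for f in CHECKED_FREQS):
        return "B"   # some loss over 15 db, but average not over 15 db
    return "A"       # no loss greater than 15 db


def is_type_caf(levels):
    """AFR 160-3 criterion: speech-frequency average of 20 db or more."""
    return speech_average(levels) >= 20


def individual_class(right, left):
    """An individual's class is that of his worse ear."""
    return max(ear_type(right), ear_type(left))  # "A" < "B" < "C"


# Hypothetical audiograms (db re audiometer zero), for illustration only.
normal_ear = {f: 0 for f in CHECKED_FREQS}
high_tone_ear = {**normal_ear, 4000: 40, 6000: 35}
speech_loss_ear = {**normal_ear, 500: 20, 1000: 15, 2000: 15}

print(ear_type(normal_ear), ear_type(high_tone_ear), ear_type(speech_loss_ear))
print(individual_class(normal_ear, speech_loss_ear), is_type_caf(speech_loss_ear))
```

The last hypothetical ear averages more than 15 db but less than 20 db, so it is Type C under the definition used in this report without being Type CAF under the regulation.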


The 25 non-noise-exposed personnel 
used in this study do not necessarily 
represent a typical sample of non-noise- 
exposed Air Force personnel. The in- 
dividuals chosen with Class A hearing 
were from a group of non-noise-ex- 
posed men in which the incidence of 
Class A, Class B, and Class C hearing is 
unknown. From an originally identified 
group of 41 flight-line personnel with 
Class C hearing (7), 25 were available 
for testing in this study. The number of 
subjects in the noise-exposed groups 
was based upon desirable sample size 
and does not represent proportional 
sampling of Class A, B, and C hearing 
among noise-exposed Air Force flight- 
line personnel. 

The 25 Class Ann individuals had 
duty assignments at the 4473rd USAF 
Hospital. The Class A, Class B, and 
Class C individuals (with one ex- 
ception) had duty assignments on the 
flight line, and their jobs intermittently 
exposed them to noise levels ranging 
from 90 db to approximately 135 db. 
Each man was exposed from a few 
minutes to a few hours per day to 
criterion-level noise which in this study 
refers to on-the-job noise which partly 
or totally masks loud speech close to 
the ear of the listener. Noise causing 
this amount of difficulty approximates 
an over-all level of 95 db or greater 
for a broad spectrum noise. No effort 
has been made in this report to quantify 
the noise levels and exposure duration 
for the personnel engaged in noisy jobs, 
such as aircraft maintenance. 


Apparatus and Method. A Beltone 
Model 15A audiometer with Tele- 
phonics TDH-39 earphones was used in 
the administration of the pure-tone 
audiometric tests. Subjects were tested 
in an Industrial Acoustics Company 
Model 401 audiometric testing room 
which was installed in one of the wards 
of the base hospital. The audiometer 
room met the requirements of specifica- 
tions set forth in the Air Force Regu- 
lation 160-125, dated 13 August 1957 
and as discussed by Cox (3). 


The tests were administered during 
a four-month period. Three times dur- 
ing this period the audiometer ear- 
phones were calibrated at the USAF 
School of Aviation Medicine according 
to the procedure recommended by the 
National Bureau of Standards (1). Con- 
sidering all test frequencies, the range 
of calibration corrections for SPL out- 
put was from minus 5.9 db to plus 3.5 
db. Corrections to the closest 0.1 db 
were applied to the mean-threshold 
and median-threshold data, so that hear- 
ing loss is reported relative to the 
American Standard audiometer zero 
(3). Frequency calibration results on 
three separate occasions showed less 
than 3% error for all test frequencies. 

Table 1. Twenty-fifth percentile, 50th percentile, and 75th percentile hearing level in decibels for right 
ears (R) and left ears (L) of 25 Class Ann subjects (median age 21.6 years; age range 18 to 
34 years), 25 Class A subjects (median age 20.2 years; age range 18 to 39 years), 50 Class B 
subjects (median age 22.2 years; age range 18 to 44 years), and 25 Class C subjects (median 
age 24.4 years; age range 19 to 44 years). 

Percentile  Ear                         Frequency (cps) 
                     250     500    1000    1500    2000    3000    4000    6000 

Class Ann 
25th        R      -10.8   -11.2    -7.8    -7.4    -9.4    -2.3    -0.3    -3.8 
            L      -12.5    -9.6    -8.3    -7.4    -6.7    -4.2     0.7    -5.1 
50th        R       -6.5    -6.2       ?    -3.7    -5.9     2.4     2.9     1.6 
            L       -5.3    -5.5    -4.2    -3.1       ?     0.6     4.6     0.5 
75th        R        1.1    -1.7    -0.5     1.9    -2.7     6.4     9.4     5.6 
            L       -0.1     0.3     1.0     1.2     0.2       ?    11.8       ? 

Class A 
25th        R       -5.3    -7.1    -4.9    -7.8    -8.9    -4.7    -1.3    -2.7 
            L       -5.0    -5.6    -4.0    -7.0    -5.4    -0.6     0.8     3.2 
50th        R       -0.1    -3.2    -1.9       ?    -5.4    -1.1     2.2     3.6 
            L        0.4     0.3     0.5    -2.3    -0.7     6.0     5.3     7.9 
75th        R        4.4     2.0     1.4     0.5    -0.3     2.8     8.9     7.5 
            L        4.4     5.3     5.0     2.2     3.6     9.5    11.8    12.1 

Class B 
25th        R       -6.1    -8.3    -5.2    -7.5    -7.4       ?     9.3     6.5 
            L       -5.9    -6.6    -5.4    -8.1    -7.0     7.9    16.5    15.2 
50th        R       -1.3    -3.6    -1.4    -1.3    -2.8    10.6       ?    23.8 
            L       -0.7    -2.4     0.1    -2.0     0.8    14.0    29.7    29.4 
75th        R        3.8     0.8     2.9     3.9     4.0    25.0    47.9    52.7 
            L        4.6       ?     4.9     4.3     7.6    22.8    49.1    56.9 

Class C 
25th        R        2.1     0.4     4.2     4.6     6.0    17.3    24.2    22.6 
            L        1.4    -1.3     8.8    11.2    12.9    23.3    24.2    18.1 
50th        R        8.9    10.2    13.6    19.9    19.2    34.1    42.4    41.1 
            L       12.0    14.6    17.8    20.9    29.1    43.3    42.8    43.1 
75th        R          ?    20.6    27.4    41.9    38.8    49.8    62.4    69.2 
            L       27.0    25.2    35.2    44.7    46.5    56.6    60.5    56.5 

Figure 1. Mean air-conduction and bone- 
conduction thresholds for right ears and left 
ears of Class Ann, Class A, Class B, and Class 
C groups. 


For air-conduction thresholds, fre- 
quencies were tested in the following 
order for all subjects: 1000, 1500, 2000, 
3000, 4000, 6000, 1000, 500, and 250 
cps. For bone-conduction tests the 
order was as follows: 1000, 2000, 4000, 
1000, 500, and 250 cps. In determining 
thresholds, the examiner used a modified 
psychophysical method of limits. A 
multiple threshold-crossing technique 
was used to establish threshold sensi- 
tivity. The hearing-loss dial setting at 
which a 50% correct response to a 
series of tone presentations was noted 
was recorded as the threshold for the 
frequency under test. It should be 
pointed out that some individuals gave 
a 100% correct response to a series of 
tone presentations when the hearing- 
loss dial was set at maximum attenua- 
tion, that is, minus 10 db. This result 
means that the minus 10 db which was 
recorded as the pure-tone threshold 
was not a good estimate of the actual 
auditory sensitivity re the current 
American Standard. This problem could 
be overcome if an auxiliary attenuator 
pad were installed so that thresholds 
lower than minus 10 db re audiometer 
zero could be measured. 

During a test session each subject 
was interviewed, and a comprehensive 
history questionnaire was completed for 
him. 


Results and Discussion 


The audiometric data were analyzed 
in order to determine the relative hear- 
ing levels* of right and left ears within 
and between the four groups under 
study. The relationship between air- 
conduction and bone-conduction re- 
sponses was assessed to establish the 
incidence of perceptive, conductive, and 
mixed-type hearing loss among person- 
nel who had Class B and Class CAF 
hearing and among ears that were classi- 
fied as Type B and Type C. 

*The term hearing level is used here in the 
sense suggested by Davis, Hoople, and Par- 
rack (4) and refers to ‘. . . the deviation in 
decibels of an individual’s threshold of hearing 
from the American Standard value for the 
reference zero for audiometers.’ 

Median and Mean Thresholds for 
Right Ears and Left Ears. Twenty- 
fifth percentile, median, and seventy- 
fifth percentile hearing levels for right 
ears and left ears of Class Ann, A, B, 
and C groups are shown in Table 1. 
The median age and the age range for 
each group also are included. 


In order to compare two estimates of 
central tendency, mean and median 
hearing levels for the four groups were 
calculated (Figure 1 and Table 1). For 
the Class Ann and Class A groups there 
was good agreement between median 
and mean thresholds at all test fre- 
quencies in both right and left ears. In 
the Class B group similar agreement was 
noted from 250 cps through 2000 cps. 
However, in the Class B group the 
effect of extreme losses from 3000 cps 
through 6000 cps in some cases influ- 
enced the mean threshold value and 
made it considerably larger than the 
median value; for example, at 6000 cps 
the median was 23.8 db, and the mean 
was 30.1 db. There was relatively good 
agreement between median and mean 
thresholds at all test frequencies in the 
Class C group. With the exception of 
the high frequencies in the Class B 
group, the medians and means approxi- 
mated each other. In other words, the 
differences between medians and means 
at the various test frequencies were 
negligible. 


The differences in hearing levels 
between right and left ears within each 
group were analyzed. Hearing levels 
in right ears and left ears were grouped 
according to greater or less loss than the 
median loss for both ears and were 
tested by chi square. In general, there 
were no significant differences at the 
5% level (or better) between right and 
left ears within Classes Ann, B, and C 
when median-threshold differences 
were tested by chi square and when 
mean-threshold differences were tested 
by the t test (9, 5). However, in the 
Class A group, significant differences 
between right- and left-ear thresholds 
were observed at 2000 cps and at 3000 
cps. 
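
The median-based chi-square comparison described above can be sketched as follows. This is a generic two-sample median test on a 2 x 2 table; the handling of values equal to the pooled median (counted as not above it), the omission of a continuity correction, and the hearing levels themselves are assumptions made for the sketch, not details reported in the study.

```python
import math
from statistics import median


def median_test(right, left):
    """Chi-square test (1 df) of right vs. left ears split at the pooled median.

    Returns (chi_square, p). Values equal to the pooled median are counted
    as "not above" it, which is an assumption of this sketch.
    """
    pooled_median = median(list(right) + list(left))

    # 2 x 2 table: rows are ears, columns are above / not above the median.
    a = sum(v > pooled_median for v in right)
    b = len(right) - a
    c = sum(v > pooled_median for v in left)
    d = len(left) - c

    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    chi_square = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of chi-square with one degree of freedom.
    p = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p


# Hypothetical hearing levels (db) at one test frequency, not study data.
right_ears = [-10, -5, -5, 0, 0, 5]
left_ears = [0, 5, 5, 10, 10, 15]

chi_square, p = median_test(right_ears, left_ears)
print(f"chi square = {chi_square:.2f}, p = {p:.3f}")
```

A result would be called significant at the 5% level when p is below .05; the t test on mean thresholds mentioned in the text is a separate procedure and is not shown here.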

Mean air-conduction and bone-con- 
duction audiograms for Classes Ann, A, 
B, and C are shown in Figure 1. In 
considering the differences between 
Class A (or Ann), B, and C, one must 
expect the reported thresholds to differ 
from each other at some frequency or 
frequencies since the criterion of selec- 
tion is based upon differences in hearing 
levels. The similarities as well as the 
differences between groups are apparent 
in Figure 1. In general, the thresholds 
for the right and left ears in Class Ann 
and Class A appear to be the same. 
For right-ear or left-ear thresholds, dif- 
ferences resulting from a comparison 
of the B group with the Ann and A 
groups were not statistically significant 
for frequencies 250 through 2000 cps 
but were significant for 3000, 4000, and 
6000 cps. When the Ann and A groups 
were contrasted with the Class C group, 
the differences between thresholds at 
the various test frequencies were all 
significant at the 1% level. Differences 
between the Class B group and the 
Class C group were significant from 
250 cps through 3000 cps, but not at 
4000 cps and at 6000 cps. In other 
words, the Class A groups were similar 
to the Class B group in the lower fre- 
quencies, and the Class B group was 
similar to the Class C group at the two 
highest test frequencies. 

Figure 2. Mean air-conduction thresholds for 
93 Type B ears and for 36 Type C ears of 
50 Class B individuals and 25 Class C individ- 
uals. 

Figure 2 shows mean air-conduction 
thresholds for 93 Type B ears and for 
36 Type C ears of 50 Class B indi- 
viduals and 25 Class C individuals. 
When these threshold results are com- 
pared to the mean thresholds of right 
and left ears of Class B and Class C in- 
dividuals which appear in Figure 1, the 
effect of considering class of individual 
rather than type of ear can be seen. 
From 250 cps through 2000 cps the 
mean threshold for Type B ears is 
approximately 5 db, and at 4000 and 
6000 cps it is about 40 db. It must be 
remembered that some of the Type B 
ears occurred in Class C individuals. In 
general, the audiometric contour is dis- 
placed downward an average of about 
7 db when the mean thresholds of Type 
B ears are contrasted with the mean 
thresholds of combined right and left 
ears of Class B individuals. When the 
mean thresholds of right and left ears 
in Class C individuals in Figure 1 are 
compared with the mean thresholds of 
Type C ears in Figure 2, it can be seen 
that the audiometric contour remains 
about the same. The differences be- 
tween these mean thresholds range from 
about 11 db at 250 cps to about 17 db 
at 6000 cps when the Class C thresholds 
in Figure 1 are compared to those 
which appear in Figure 2. 

Table 2. Incidence of Type A, Type B, and 
Type C hearing in right ears and left ears of 
50 Class B individuals and 25 Class C in- 
dividuals. 

Right Ear          Left Ear          Total 
               A      B      C 
A                    13      2        15 
B              3     34      9        46 
C              3      0     11        14 
Total          6     47     22        75 


Table 2 shows the incidence of Type 
A, Type B, and Type C hearing in right 
and left ears of 50 Class B and 25 Class 
C individuals. In the Class B group 34 
individuals had binaural Type B hear- 
ing, and there were 84 Type B ears in 
this group of 50 Class B individuals. In 
the Class C group, 11 had binaural Type 
C hearing, and there were 36 Type C 
ears in this group of 25 Class C indi- 
viduals. Proportionally, there was more 
binaural Type B hearing in the Class 
B group than binaural Type C hearing 
in the Class C group. 
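
The ear counts quoted above can be checked against the cells of Table 2. The short Python sketch below is illustrative only; the variable names are hypothetical, and the empty cell for right ear Type A with left ear Type A is entered as zero, which the row total of 15 implies.

```python
# Cell counts from Table 2: counts[right_ear_type][left_ear_type].
# The A/A cell is blank in the table and is taken as 0 here (the row total
# of 15 equals 13 + 2), since such individuals would be Class A.
counts = {
    "A": {"A": 0, "B": 13, "C": 2},
    "B": {"A": 3, "B": 34, "C": 9},
    "C": {"A": 3, "B": 0, "C": 11},
}

class_totals = {"A": 0, "B": 0, "C": 0}
ears_by_class = {cls: {"A": 0, "B": 0, "C": 0} for cls in "ABC"}

for right, row in counts.items():
    for left, n in row.items():
        cls = max(right, left)   # class follows the worse ear ("A" < "B" < "C")
        class_totals[cls] += n
        ears_by_class[cls][right] += n
        ears_by_class[cls][left] += n

type_b_ears_in_class_b = ears_by_class["B"]["B"]
type_c_ears_in_class_c = ears_by_class["C"]["C"]
all_type_b_ears = sum(ears_by_class[cls]["B"] for cls in "ABC")

binaural_b_share = counts["B"]["B"] / class_totals["B"]
binaural_c_share = counts["C"]["C"] / class_totals["C"]

print(class_totals, type_b_ears_in_class_b, type_c_ears_in_class_c)
print(all_type_b_ears, binaural_b_share, binaural_c_share)
```

The proportions in the last line correspond to the comparison made in the text: binaural Type B hearing among the Class B individuals against binaural Type C hearing among the Class C individuals.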


Speech-reception threshold data were 
not accumulated from the subjects used 
in this study. The relationships between 
pure-tone thresholds and predicted 
speech-reception thresholds, however, 
are fairly well known (2). Since the 
500 to 2000 cps average closely ap- 
proximates the speech-reception thresh- 
old, no significant reduction in speech- 
reception ability should be expected as 
a result of high-frequency loss in the 
Class B group. The effect of the high- 
frequency loss on speech discrimination 
in this group would need to be estab- 
lished before the effect on social ade- 
quacy could be described. In other 
words, additional auditory tests would 
be necessary before the probable need 
for aural rehabilitation could be iden- 
tified. 

Figure 3. Median hearing losses in right and left ears of (a) an age-selected Class A non- 
noise-exposed Bergstrom AFB group, (b) Air Force recruits at Lackland AFB, and (c) a 
selected group of males in the 1954 Wisconsin Hearing Survey. 


The estimated binaural speech-recep- 
tion threshold of the Class C group as 
a whole would be approximately 17 db. 
The binaural Type C individuals would 
have speech-reception thresholds in ex- 
cess of 20 db. In an earlier study by 
Kopra and others (7) the incidence of 
Class A, Class B, and Class C hearing 
was established for a group of 996 Air 
Force flight-line personnel: Class A, 
49%; Class B, 47%; and Class C, 4%. 
In the present study, 11 of 25 Class C 
individuals had binaural Type C ears. 
In the 996 flight-line personnel in the 
Kopra study, the incidence of binaural 
Type C hearing among flight-line per- 
sonnel was approximately 1.8%. The 
medical reversibility among this latter 
group should be studied before state- 
ments concerning the probable need 
for aural rehabilitation can be made. 
The recently inaugurated hearing con- 
servation program in the Air Force 
ought to identify Class C hearing among 
Air Force recruits and among active 
service personnel so that remedial at- 
tention can be given. The subsequent 
disposition of individuals identified as 
having binaural Type C hearing should 
reduce the incidence of binaural Type 
C hearing among Air Force personnel. 
Therefore, the effect that the hearing 
conservation program has in reducing 
this incidence should be taken into ac- 
count if estimates of binaural Type C 
hearing among flight-line personnel are 
based on these results. 
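
The figure of approximately 1.8% is consistent with applying the proportion of binaural Type C hearing found among the 25 Class C subjects to the 4% incidence of Class C hearing in the earlier survey. The arithmetic below is an inference from the reported numbers, offered as a check; the text does not state that the estimate was obtained in exactly this way.

```python
# Values reported in the text.
class_c_incidence = 0.04         # Class C among 996 flight-line personnel (7)
binaural_share = 11 / 25         # binaural Type C among the 25 Class C subjects
surveyed = 996

# Assumed derivation: share of all flight-line personnel with binaural Type C.
binaural_incidence = class_c_incidence * binaural_share
expected_men = binaural_incidence * surveyed

print(f"{binaural_incidence:.1%} of personnel, about {expected_men:.0f} men")
```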


Figure 3 shows a comparison of me- 
dian thresholds in right and left ears 
of three groups of young males: (a) 
an age-selected Class A non-noise-ex- 
posed Bergstrom AFB group in the 
present study, (b) Air Force recruits at 
Lackland AFB as reported by O’Con- 
nell (8), and (c) a selected group of 
males in the 1954 Wisconsin Hearing 
Survey as reported by Glorig and 
others (6). The median thresholds of 
the Bergstrom group and the Lackland 
group are close to each other from 500 
cps through 2000 cps and at 6000 cps. 
When compared to the selected group 
of Bergstrom AFB males, the Lackland 
male recruits had better median thresh- 
olds at 3000 cps and 4000 cps. At this 
time it is difficult to determine the im- 
portance of this difference. With one 
exception (at 4000 cps), the Bergstrom 
AFB group of non-noise-exposed young 
males had better median thresholds than 
the Wisconsin selected normal group. 
It is very probable that the psycho- 
physical method used in the measure- 
ment of threshold hearing accounts for 
the consistent threshold differences be- 
tween these groups. Before one can 
meaningfully compare and evaluate the 
differences between two or more sets of 
data, obviously the effects of different 
psychophysical methods and all other 
test variables must be taken into ac- 
count. 

Types of Hearing Loss. The diag- 
nosis of the type of hearing loss among 
job-noise-exposed personnel is impor- 
tant since significant temporary or per- 
sistent threshold shifts may have medi- 
cal, job-placement, and rehabilitational 
implications. The pure-tone audiometric 
thresholds established for each individ- 
ual in this study revealed the hearing 
level for that individual. Since test- 
retest threshold differences were not 
available from these data, no meaning- 
ful interpretation could be attached to 
hearing levels which deviated signifi- 
cantly from the American Standard 
value for reference zero in audiometers. 
However, it is worthwhile to note the 
incidence of the types of hearing loss 
among Class B and Class C individuals. 
Table 3 shows the number of ears diag- 
nosed as having conductive, perceptive. 
and mixed-type hearing loss in Class B 


Table 3. Number of ears diagnosed as having
conductive, perceptive, mixed-type, and in-
definite hearing loss in Type B and Type
CAF ears.

Type of           Type of Hearing Loss
Ear           Cond   Perc   Mix   Ind   Total

B               5     93     5     0     103
CAF             4     16     3     1      24

Total           9    109     8     1     127


Table 4. Number of right and left ears diag-
nosed as having conductive, perceptive, mixed-
type, and indefinite hearing loss in Type B
and Type C ears.

             Type of      Type of Hearing Loss
             Ear      Cond   Perc   Mix   Ind   Total

Right Ear    B          2     42     2     0      46
             C          3      9     1     0      13

             Total      5     51     3     0      59

Left Ear     B          3     44     0     0      47
             C          1     12     5     3      21

             Total      4     56     5     3      68


and Class CAF (AFR 160-3 definition 
of Class C) groups. One Class C in- 
dividual had a significant nonorganic 
component and is, therefore, not repre- 
sented in Table 3. The type of hearing 
loss was diagnosed for 127 ears. The re- 
maining 21 ears of Class B and Class C 
individuals were Type A ears and, 
therefore, were not diagnosed as having 
conductive, perceptive, or mixed type 
of hearing loss. For the total 127 right 
and left ears in Class B and Class C 
noise-exposed individuals (excluding 
one indefinite), the approximate per 
cent of each type of hearing loss is as 
follows: 7% conductive, 87% percep- 
tive, and 6% mixed-type hearing loss. 
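
The approximate percentages above can be rechecked from the column totals of Table 3. The following Python sketch is illustrative only: the function name `percent_by_type` is hypothetical, and the counts are the Table 3 totals with the one indefinite ear left out.

```python
def percent_by_type(counts):
    """Return each type's share of the total, rounded to the nearest whole per cent."""
    total = sum(counts.values())
    return {kind: round(100 * n / total) for kind, n in counts.items()}

# Column totals from Table 3, excluding the one indefinite ear (126 ears).
table3_totals = {"conductive": 9, "perceptive": 109, "mixed": 8}
print(percent_by_type(table3_totals))
```

Under these assumptions the shares come to about 7, 87, and 6 per cent, in line with the figures quoted in the text.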

The types of hearing loss in right ears 
and left ears which fell into Type B and 
Type C (500-2000 cps average of more 
than 15 db) categories are shown in 
Table 4. Of the total 127 ears consid- 
ered, 59 right ears and 68 left ears were 
identified in Type B and Type C 
groups. Right ears had approximately 
the same per cent of each type of hear- 
ing loss as did the left ears. Omitting 
three indefinite ears, 7% of the right
and left ears were diagnosed as having
conductive hearing loss, 86% percep- 
tive, and 7% mixed type. This inci- 
dence is approximately the same as 
that observed for right and left ears 
in Class B and Class C individuals when 
the AFR 160-3 definition of Class C 
was used to define Class CAF hearing. 
The effect of using more than 15 db 
for the 500 to 2000 cps average com- 
pared to 20 db or more to define Class 
C and Class CAF hearing can be seen 
by comparing the incidence of Type 
CAF hearing (24 re AFR 160-3 
definition) in Table 3 and the inci- 
dence of Type C hearing (34 re defini- 
tion in the present study) in Table 4. 
Generally, perceptive-type hearing loss 
predominates among Class B and Class 
C noise-exposed individuals. However, 
it should be noted that a large propor- 
tion (approximately 13%) of these ears 
have either conductive or mixed in- 
volvement. This latter observation 
should be borne in mind when attempts 
are made to study the antecedent-con- 
sequent relationships between noise ex- 
posure and hearing loss. 


Summary 


Pure-tone air-conduction and bone- 
conduction audiometric tests were ad- 
ministered to four Air Force personnel 
groups: (a) 25 Class A (no hearing loss 
greater than 15 db in either ear, 500 
through 6000 cps), non-noise-exposed 
(Class Ann); (b) 25 Class A, noise- 
exposed; (c) 50 Class B (hearing loss 
more than 15 db in either ear at any 
frequency, 500 through 6000 cps, aver- 
aging not more than 15 db for 500, 
1000, and 2000 cps), noise-exposed; and 
(d) 25 Class C (500 to 2000 cps, average 
loss more than 15 db), noise-exposed. 
Results showed: good agreement in
general between median and mean
thresholds; no differences between
right and left ears within groups; Class 
Ann and A thresholds close to Ameri- 
can Standard reference normal; Class 
B deviations from Class A only at 
3000, 4000, and 6000 cps; Class C dif- 
ferent from others (except Class B, 
4000 and 6000 cps) at all test frequen- 
cies; conductive or mixed-type loss ap- 
proximately 13%, perceptive-type 87%, 
Classes B and C.


Anxiety and Hostility in Stuttering


SEBASTIAN SANTOSTEFANO 


The importance of anxiety in the phe- 
nomenon of stuttering has been noted 
in many studies which have explored 
psychological aspects of stuttering and 
the general personality characteristics 
of stutterers. Whether projective tech- 
niques and questionnaires are used (3, 
16, 21, 24, 27); whether there is a com- 
bined approach such as case history, 
neurological, psychiatric, and psycho- 
logical examination (7); or whether 
measures are taken to evaluate the re- 
lationships between personalities of stut- 
terers and their parents (33), these 
studies make mention of the high de- 
gree of anxiety, shyness, and insecurity 
which seems to characterize the stut- 
terers studied. This same general ob- 
servation is made also by workers who 
have treated stutterers with psycho- 
therapy (8, 18, 28), and by those re- 
porting findings based upon routine 
clinical observations (25). Goodstein 
(12), in summarizing his recent review 
of research on stuttering and personal- 
ity stated, ‘When, on the other hand, 
they (stutterers) are compared with 
presumably normal individuals they do 
appear different, usually somewhat 
more anxious, tense and socially with- 
drawn.’ 

Studies concerned directly with anxi-
ety and stuttering often have investi-
gated anxiety as it is associated with or
generated by specific words and speak- 
ing situations (9, 17, 26, 30). But as 
Wischner (34) has stated in this regard, 
‘We are concerned here with immediate
cues (in words and speaking situations)
which precipitate a change in the anxi-
ety level of a stutterer. Whether stut-
terers are generally more anxious than
nonstutterers is another problem capa-
ble of investigation.’ Only a few studies
have investigated directly general anx- 
iety of stutterers. The studies cited 
above bring attention to anxiety as a 
clinical observation or as a finding which
is noted in addition to the main concerns
of the study. One study by Boland (4)
investigated ‘chronic or general anxiety’
in stutterers as well as anxiety associated 
specifically with speaking. He reported 
that the stutterers showed significantly 
higher general anxiety than nonstut- 
terers as measured by two inventories. 
Goss (15) has reported findings which 
suggest an anxiety gradient in stutter- 
ing behavior. He found that the longer 
the time interval between the exposure 
of a stimulus word and the signal for 
the stutterer to say the word, the great- 
er the probability that the word would 
be stuttered. Goss concluded that the 
longer the time interval, the greater the 
anxiety and thus the higher the fre- 
quency of stuttering. Baron (2) found 
that a group of stutterers, required to 
speak a word following the uncondi- 
tioned stimulus, acquired a conditioned 
eyelid response more rapidly than a
group not required to say a word.
The results were interpreted as an 
indication that the anxiety generated 
by having to say a word contributed 
to the total drive in the situation, caus- 
ing more rapid conditioning. 

Hostility and stuttering have not re- 
ceived the same widespread attention 
as have anxiety and stuttering. Two 
writers (18, 21) have singled out hostil- 
ity as being a significant dimension in 
understanding and dealing with stutter- 
ing. Abbott (1), in speculating that re- 
pressed hostility is an important factor 
in adult stuttering, posed the question, 
‘Are we to assume that the stutterer has 
only loving, tender feelings towards 
one whose mere presence may fill him 
with panic?’ Abbott argued that because 
of the great fear of being rejected and 
embarrassed, and because of feelings of 
inadequacy, the stutterer develops a 
great deal of hostility. The hostility is 
then repressed because of the stutterer’s 
need for the listener and the desire to 
be accepted and liked. 


The present study is concerned with 
whether stutterers and nonstutterers can 
be differentiated on the basis of anxiety 
and hostility, and whether this anxiety 
and hostility add a relatively more dis- 
ruptive influence to the functioning of 
stutterers than of nonstutterers. It was 
predicted that, compared with non- 
stutterers, stutterers would show more 
anxiety and hostility as measured by a 
published rating scale of the Rorschach 
test. It was also predicted that stutterers 
would be made more anxious and hos- 
tile, under laboratory induced stress, 
than nonstutterers and that this anxiety 
and hostility would have a disruptive 
effect on a previously learned task. 
These predictions are based not only on
the literature but also on the assump-
tion, noted by others (1), that speech 
disorders frequently assume a negative 
social stimulus value and, therefore, the 
personality makeup of the speech- 
handicapped person is partially influ- 
enced by the responses of others to 
his speech problem. Thus, it is assumed 
that behavior and reactions classified as 
anxious and hostile are functionally re- 
lated to the experience of stuttering and 
of being a stutterer. This study does 
not take a stand on the question of 
whether anxiety and hostility are re- 
sults of stuttering or causes of stutter- 
ing. It is hypothesized only that a 
relationship exists which has significance 
for the understanding and treatment of 
this phenomenon. To clarify this, a dis- 
cussion of the concepts of anxiety, 
hostility, and stress, as used in this 
study, seems indicated at this point 
since they do not have fixed meanings 
in behavioral inquiry. 


Anxiety and hostility are used here
as constructs which imply emotional
states and systems of tension triggered
by some stimulus. Anxiety is defined
as apprehension cued off by a threat to 
anything which the individual holds 
essential to his existence as a personality 
(22). The threat may be to physical or 
psychological life or to any value which 
the individual identifies with his exist- 
ence. Some special characteristics of 
anxiety are feelings of uncertainty and 
helplessness in the face of the threat. 
Hostility is the anger, resentment, and 
enmity cued off by the same kind of 
threat. Important to this study is the 
assumption that anxiety and hostility 
are interrelated, one affect usually gen- 
erating the other. First, anxiety usually 
gives rise to hostility. That is, anxiety
with its concomitant feelings of help-
lessness, fear, and conflict, causes a
painful experience and a person tends 
to be angry and resentful toward those 
responsible for placing him in such a 
situation. This hostility in turn gener- 
ates more anxiety, and so on, each tend- 
ing to reinforce the other. Stress, as 
used in this study, designates a situation 
or set of conditions which are presumed 
to threaten, provoke, or make stress- 
ful the behaving organism within the 
situation. It is also assumed, in line with 
the foregoing discussion, that anxiety 
and hostility are among the responses 
which stress or threat will evoke. Stress 
is considered synonymous with the dis- 
ruption of, or the disintegration of, an 
organism’s behavioral response config- 
uration which has been previously 
established, according to some criteria, 
as adequate. It is assumed that there 
would be no way of appraising this 
state of stress and threat were it not 
for the changes it produces in the re- 
sponse system of the organism (29, 
p. 35). 

The assumption was accordingly
made that stuttering and being a stut-
terer places an individual in a fairly
constant state of stress because of actual
and continually imminent negative re-
actions by the environment and because
of the stutterer’s own evaluation and 
interpretation of the handicap in terms 
of his self-esteem, security, and identity. 
These conditions within the stutterer 
and the stutterer’s environment, and 
their interaction, constitute a threat to 
values which the stutterer holds essen- 
tial to his existence as a personality. It 
was hypothesized that this state of 
stress, under which the stutterer func- 
tions, ultimately results in predominant
and enduring states of anxiety and hos-
tility. Accordingly, compared to non- 
stutterers, stutterers should show that 
greater amounts of anxiety and hostility 
characterize their personality make-up 
and that these affective attitudes have 
a greater disruptive influence on their 
functioning and ability to deal with the 
environment. 

With these considerations in mind, 
the investigator chose two approaches 
for investigating the concepts of anxiety 
and hostility in stuttering. First, a pro- 
jective technique was used (the Ror- 
schach test) to obtain a measure of 
the degree of anxiety and hostility
which characterizes an individual. Sec- 
ond, an attempt was made to ap- 
proximate, to some degree, a real-life, 
threatening situation which presumably 
would elicit states of anxiety and hos- 
tility within the subjects. The labora- 
tory situation was devised so that the 
disrupting and disintegrating influences 
of these emotional states on the behavior 
of the subjects could be assessed. A 
projective test was selected rather than 
other devices such as pencil and paper 
inventories because of the various ad- 
vantages presumed to be offered by 
projective testing (19). For example, 
self-assessment is minimized and there 
is free expression of preconscious and 
unconscious needs, feelings, conflicts, 
and adaptive techniques. However, be- 
cause of the difficulties in using the 
Rorschach as a research tool to yield 
specific personality measures or scores 
not yet satisfactorily validated, a rating 
scale with some demonstrated validity 
was used to analyze the projective data. 


Method 


Subjects. Two groups of subjects 
were used: a group of 26 stutterers
and a group of 26 nonstutterers. Each
group consisted of 20 males and six fe- 
males. The 26 nonstutterers were se- 
lected so that their sex, mean age, and 
mean IQ matched those of the stutter- 
ing group. The mean age of both 
groups was approximately 20 years, with 
a standard deviation of 2; the mean IQ 
of both groups was approximately 121, 
with a standard deviation of 7. 


Rorschach Content Test. In 1949, 
Elizur (10) reported a validation study 
of a new method for deriving anxiety 
and hostility scores from the content 
of Rorschach protocols. The method 
was called the Rorschach Content Test 
(RCT). Rather than analyzing such 
traditional aspects of a subject’s re- 
sponse to a Rorschach card as the area 
of the blot used, the use of shading 
nuances, and the presence of color, 
Elizur examined only the content of 
the responses, and the action and affect 
assigned to this content. In constructing 
and conceptualizing the technique, Eli- 
zur made use of a main fundamental 
assumption underlying projective test- 
ing and the meaning of responses ob- 
tained by this method. He stated that 
a response is ‘the integration into a 
single experience of present stimuli 
with past experience. In other words, 
perception is more than a pure cogni- 
tive function of the individual; it rep- 
resents the product of the integration of 
many aspects of the total personality, 
including needs, strivings and emotions.’ 

Elizur’s scoring system yielding anxi- 
ety and hostility scores is as follows: 
where unveiled anxiety is explicitly ex- 
pressed ‘A’ is assigned to this response 
which is given a weight of two. Ex- 
amples receiving these scores are: ‘a 
fearful monster’ and ‘a little girl run-
ning in terror.’ If a response explicitly
expresses hostility, the capital letter ‘H’
is assigned and this response also re-
ceives a weight of two. For example,
‘two men fighting’ and ‘a tiger leaping
on its prey.’ However, if an expression
of anxiety or hostility is less overt or
symbolic, a small letter ‘a’ or ‘h’ is
assigned, as the case may be, and is
given a weight of one. To illustrate
the scoring system, Elizur gives de- 
tailed examples which are grouped into 
six categories; for example, symbolic 
responses (as, dead leaves); cultural 
stereotypes of fear and hostility (as, 
snakes, blood); objects of aggression 
(as, gun, teeth). An individual’s scores 
in both anxiety and hostility are the 
totals of the weights assigned to ‘a, A’ 
and ‘h, H,’ respectively. 
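
The tallying step described above can be sketched in Python. This is a minimal illustration, not Elizur's procedure itself: the function name `rct_scores` and the list-of-letters input are assumptions, and the judgment of which letter a response deserves remains the rater's task and is not modeled here.

```python
# Weights as described in the text: capital letters count two, small letters one.
WEIGHTS = {"A": 2, "a": 1, "H": 2, "h": 1}

def rct_scores(ratings):
    """Sum the weights of rated responses into (anxiety, hostility) totals.

    `ratings` holds one letter per scored response ('A', 'a', 'H', or 'h');
    responses judged to carry no anxiety or hostility are simply not listed.
    """
    anxiety = sum(WEIGHTS[r] for r in ratings if r in ("A", "a"))
    hostility = sum(WEIGHTS[r] for r in ratings if r in ("H", "h"))
    return anxiety, hostility

# Hypothetical protocol: one explicit and two less overt anxiety responses,
# plus one explicit hostility response.
print(rct_scores(["A", "a", "a", "H"]))  # (4, 2)
```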

Elizur used questionnaires, self-rat- 
ings, and interview material as criteria in 
his validation study, the results of which 
support the contention that the RCT 
is a valid rating scale. From the criteria, 
Elizur concluded that individuals with 
high anxiety scores ‘tended to suffer 
from fears or phobias, were inclined to 
worry excessively and generally lacked 
confidence in themselves. And, indi- 
viduals with high hostility scores were 
persons who showed up as building 
up within themselves a considerable 
amount of resentment against others. 
In his overt behavior, he might rather 
demonstrate special kindness and sub- 
missiveness but the internal accumu- 
lated hostile feelings would tend to 
keep him aloof from any warm inter- 
personal relationships.’ Studies investi- 
gating the reliability of the RCT 
scoring method yielded coefficients 
ranging from .77 to .93. 

Elizur’s RCT was applied recently 
by Gorlow, Zimet, and Fine (14) in
a study investigating the assumption
that delinquents are generally con- 
sidered to be anxious, hostile, and 
threatened individuals. They found that 
delinquents obtained significantly high- 
er anxiety and hostility scores than non- 
delinquents. 


Laboratory Stress. In general, the 
purpose of this phase of the investiga- 
tion was to introduce stress by means of 
spoken ‘threat words’ and by requiring 
subjects to free associate to these words. 
The effect of this stress would be de- 
termined by noting the decrement in 
the subject’s performance on a previ- 
ously learned task. Ever since the 
experiments by Jung (20), who estab- 
lished the relationship between emotion- 
ality and speed of verbal association, 
many studies have concerned them- 
selves with the perception of and reac- 
tion to emotionally charged stimuli. 
Investigating the effect of emotion on 
perception, workers (5, 6, 23) have 
utilized the procedure of presenting 
threat (emotionally toned) words and 
neutral words to subjects by means of 
a tachistoscope and noting the differ- 
ences in recognition thresholds. It was 
decided that for this study the emotion- 
ally toned words used by these workers 
would be used in two ways to create 
a stressful condition. The investigator 
would speak these words and would 
require the subject to free associate to 
the word, writing down the first word 
that came to his mind. It was hypothe- 
sized that if this condition was suffi- 
ciently stressful, it would create enough 
anxiety and hostility to cause a signifi- 
cant delay in the recall of previously 
learned responses whose stimuli fol- 
lowed these threat words as compared 
to the time required to recall responses
whose stimuli followed neutral words.
Thus, it was hypothesized that threat 
words would create enough anxiety and 
hostility to cause a decrement in per- 
formance of both stutterers and non- 
stutterers. It was further hypothesized, 
however, that stutterers would be made 
more anxious and hostile than nonstut- 
terers by this stressful condition and 
would, therefore, show a greater decre-
ment in performance. Therefore, if both
the stutterers and nonstutterers learn a 
response to a specific stimulus, and if 
then they are required to produce this 
response immediately following their 
association to a threat word, the stut- 
terers being more disrupted and dis- 
turbed by the stress will require 
significantly more time to recall and 
produce this response. 


To investigate these assumptions, the 
following paradigm was developed. The 
initial condition consisted of a learning 
task of 12 pairs of digits and symbols 
taken from the Wechsler-Bellevue Digit 
Symbol Test, Forms I and II, and pre- 
sented according to the paired-associate 
technique by means of a screen and 
projector. These digits and symbols 
were chosen because they approximated 
nonsense material, thereby controlling 
in part for previous learning and experi- 
ence. The digits of each of these pairs 
constituted the stimuli, and their respec- 
tive symbols constituted the responses. 
The procedure for the learning condi- 
tion was as follows: a digit was flashed 
on a screen for a period of two seconds. 
This was followed by the digit’s re- 
sponse (that is, symbol) which was 
also flashed for two seconds. After 
seeing the list of pairs several times in 
this manner, when a subject saw a digit 
whose symbol he recalled, he would
write this symbol on the paper pro-
vided, before the symbol appeared on
the screen. The subject was shown the 
digit-symbol pairs until he was able to 
produce the symbol paired to each 
digit upon seeing the digit. To complete
this learning task, all subjects were re- 
quired to satisfy the following criteria: 
to respond within the two-second inter- 
val during which the digit was pre- 
sented on the screen and to respond 
through two successive, errorless trials
or presentations of the list of 12 pairs. 
Thus, the two groups of subjects 
learned the list of digits and symbols 
equally well and were approximately 
equated for rate-of-response in that 
they all were able to recall and write the 
symbol within the two seconds during 
which the digit was shown. To prevent 
serial learning, several film strips were 
used, each one containing the digit- 
symbol pairs in a different order. 
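
The learning criterion described above (two successive errorless trials, each response given within the two-second interval) can be expressed as a short check. The sketch below is hypothetical: the function name `reached_criterion` and the record format of `(correct, seconds)` pairs are assumptions introduced for illustration, not part of the original procedure.

```python
def reached_criterion(trials, limit=2.0, needed=2):
    """Return True once `needed` successive trials are errorless and on time.

    Each trial is a list of (correct, seconds) pairs, one per digit-symbol pair.
    A trial counts only if every response is correct and within `limit` seconds.
    """
    streak = 0
    for trial in trials:
        errorless = all(correct and seconds <= limit for correct, seconds in trial)
        streak = streak + 1 if errorless else 0
        if streak >= needed:
            return True
    return False

# Hypothetical records for a list of 12 pairs.
perfect = [(True, 1.5)] * 12
one_error = [(True, 1.5)] * 11 + [(False, 1.5)]
print(reached_criterion([one_error, perfect, perfect]))  # True
print(reached_criterion([perfect, one_error, perfect]))  # False
```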


Immediately after the initial learning 
was completed, the stress condition was 
introduced and consisted of 12 succes- 
sive sequences. The procedure for each 
sequence was as follows: The experi- 
menter said a word; the subject asso- 
ciated to this word, writing the first 
word that came into his mind. It should 
be noted that the instructions preceding 
this experimental condition emphasized 
that the experimenter would read and 
note the subject’s association to the 
word. This was done in order to in- 
crease the probability that the threat 
words and the subject’s association to 
them would place the subject under 
stress, making him anxious and hostile. 
Immediately after writing this word, 
the subject looked at the screen on 
which was flashed, for two seconds, 
one of the digits from the list of digits
and symbols which he had just learned.
Upon seeing the digit, the subject 
wrote the symbol paired to that digit. 
The experimenter recorded the time 
which elapsed between the presentation 
of the digit and the writing of the 
symbol by the subject. A period of 
90 seconds separated the end of one 
sequence from the presentation of the 
next word which began the following 
sequence. 

The list of 12 words which intro- 
duced the 12 sequences consisted of 
six threat words and six neutral words 
which were equated for frequency of 
use and number of letters by means of 
Thorndike and Lorge’s word-frequency 
tables (31). Thus, the threat word 
‘whore’ was equated in terms of fre- 
quency of use and number of letters by 
the neutral word ‘quota.’ The remain- 
ing five threat words and five neutral
words were, respectively: penis, mastur-
bation, intercourse, tit, Kotex; and radio,
extravaganza, predecessor, ace, circle.


To control for adaptation, fatigue,
and the effect of the physical charac- 
teristics of the digit-symbol pairs on 
retention of these pairs, both groups of 
subjects were divided into two equal 
parts for the test condition. The second 
half of each group was presented the 
digits in a sequence the reverse of that 
for the first half; and those digits which 
followed threat words for the first half 
of each group followed neutral words 
for the second half. 


Somewhat related to this experiment 
of performance under stress is one 
study by Westrope (32) which investi- 
gated the relationship between manifest 
anxiety and changes in performance 
under stress (being watched while per-
forming the Wechsler-Bellevue Digit
Symbol Test) and between Rorschach
indices and manifest anxiety. It was 
found that six Rorschach measures 
(including the RCT anxiety score) dif- 
ferentiated between anxious and non- 
anxious subjects as determined by the 
Taylor Manifest Anxiety Scale. Stress 
significantly impaired the subjects’ per- 
formance but none of the Rorschach 
indices of anxiety correlated signif- 
icantly with the subjects’ decrement 
in digit-symbol performance. It should 
be noted that Westrope’s study used the 
digit-symbol test in the usual manner 
while the present study used it as a test 
of recall rather than of speed of coding. 

Procedure. All subjects were seen in- 
dividually for each of the three main 
phases of this investigation. The Otis 
Test of Mental Ability, Higher Exami- 
nation: Form A, was administered first. 
Several days later, this same subject was 
administered the Rorschach. Approxi- 
mately one week later, the subject was 
required to perform in the laboratory 
stress situation. In all cases, the subjects 
completed their performance in each of 
the three sessions within a two-week 
period. The 26 nonstutterers were chosen
from among volunteers, each volunteer 
matched with a stutterer on the basis 
of IQ, age, and sex. Initially the IQ 
test was given to a larger number of 
nonstutterers to insure that 26 would 
be found who not only would meet the 
IQ, age, and sex requirements but who 
also would agree to take part in the
study. As it developed, the first 26 who
met the requirements were willing to 
participate in the study. It would seem, 
therefore, that the nature of the control 
group was not contaminated by a se- 
lection factor. 

The 52 Rorschach protocols obtained 
from the stutterers and nonstutterers 
were typed and given code numbers in 
place of names. Using Elizur’s method 
to score the content of the Rorschach 
protocols, the author obtained anxiety 
and hostility scores for each subject. 
The coding disguised whether a 
protocol was that of a stutterer or non- 
stutterer. Although the examiner admin- 
istered the Rorschachs, it is felt that the 
coding, the number of Rorschachs in- 
volved, and the time which elapsed be- 
tween administering the tests and rating 
them, resulted essentially in blind scor- 
ing of the protocols. 


Results and Discussion 


On the Rorschach Content Test 
(Table 1) the stutterers obtained signif-
icantly higher anxiety and hostility
scores than nonstutterers, the differ- 
ences being significant in both cases at 
the 1% level. It is especially note- 
worthy, because of this result, that the 
stutterers averaged fewer responses to 
the 10 Rorschach cards than did the 
nonstutterers. The stutterers’ mean 
number of Rorschach responses was 28 
and that of the nonstutterers was 31. 
The results of Elizur’s validation study, 


Table 1. Mean Rorschach Content Test (RCT) anxiety and hostility scores of stutterers and
nonstutterers and p values of the differences.

                    Nonstutterer         Stutterer
Scores              Mean      SD       Mean      SD       t       p

RCT Anxiety      [illegible]  2.68     10.19     4.78    4.05    .01
RCT Hostility       1.96      1.83      4.81     2.96    4.09    .01


Table 2. Mean reaction-time differences in seconds of stutterers and nonstutterers in labora-
tory stress condition and p values of these differences.

Group              Diff     SD       t       p

Nonstutterers       .73     .86     4.24    .01
Stutterers         1.25     .98     6.41    .01

reported above, offer an interpretation 
for these findings. By obtaining signif-
icantly higher anxiety and hostility
scores than nonstutterers, the stutterers 
seem to show that they have less self- 
confidence, suffer more from fears, are 
more anxious, and tend to worry more 
than nonstutterers. Furthermore, the re- 
sults suggest that the stutterers have 
accumulated within themselves a con- 
siderable amount of resentment against 
others, as well as hostility. In consider- 
ing these findings, it should be noted 
that the Rorschach testing required 
verbal responses which could have 
increased the situational stress for stut- 
terers and, therefore, may have ac- 
counted for some of the differences 
observed. However, the investigator is 
of the opinion that this possibly greater 
situational stress for the stutterers could 
not have produced, alone, the magni- 
tude of the differences obtained. 

To evaluate the subjects’ performance
under the stress condition, the follow- 
ing measure was computed. Each sub- 
ject obtained six reaction times to 
stimuli (numbers) which followed their 
associations to threat words and six 
reaction times to stimuli which fol-
lowed their associations to neutral
words. The respective means of these
time measures, under the threat and 
nonthreat conditions, were computed 
for each subject as was the difference 
between the means. The mean reaction 
time in threat conditions was greater 
than the mean reaction time in neutral 
conditions for all subjects. It was 
assumed that the difference between 
these means represented the degree 
to which a subject’s previously learned 
responses were disrupted by the anx- 
iety and hostility created by the 
stress. This measure will be referred to 
as mean reaction-time difference. The 
results of the investigation of perform- 
ance under stress are contained in 
Tables 2 and 3. As Table 2 indicates, 
both groups required significantly more 
time to recall the responses to digits 
following the subject’s associations to 
threat words than to recall responses 
in the neutral trials. Nonstutterers re- 
quired, on the average, .73 sec more 
to respond in threat conditions versus 
neutral conditions, and the stutterers 
an average of 1.25 sec more. Both
of these differences are significant at 
the 1% level. These results support 
the assumption that the threat words 
as used here created a situation or set 


Table 3. Difference in seconds between the stutterer and nonstutterer groups’ mean reaction-
time difference in laboratory stress condition and p value of the difference.

Nonstutterers       .73

of conditions which could be called
stressful as that term is defined in this 
study. That is, hearing a threat word 
and having to free associate to it 
aroused anxiety and hostility which in 
turn disrupted previously learned re- 
sponses, causing a significant decrement 
in the efficiency of all subjects. How- 
ever, stutterers required significantly 
more time to respond after their associ- 
ations to threat words, as compared to 
neutral words, than did nonstutterers 
(Table 3). This result supports the 
hypothesis that the stutterers were made 
more anxious and hostile by the stress- 
ful condition than were the non- 
stutterers and that this caused a greater 
disruption and disintegration of previ- 
ously learned behavior for them than 
for the nonstutterers. In considering 
this interpretation of the results, it 
should be recalled that both groups 
learned the pairs of digits and symbols 
equally well and both acquired a fairly 
equal and rapid rate of response. That 
is, during the learning trials, all subjects 
eventually were able to produce the 
symbol to any digit within two seconds. 
Furthermore, it can be assumed that the 
threat and neutral words were equally 
familiar to both groups on the average. 
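
The per-subject measure used in this analysis, the mean reaction-time difference, can be sketched as follows. The function name and the sample times are hypothetical and serve only to show the arithmetic: the mean of the six threat-word trials minus the mean of the six neutral-word trials.

```python
from statistics import mean

def mean_reaction_time_difference(threat_times, neutral_times):
    """Mean time after threat words minus mean time after neutral words (seconds)."""
    return mean(threat_times) - mean(neutral_times)

# Hypothetical reaction times in seconds for one subject, six trials per condition.
threat = [2.5, 3.0, 2.0, 3.5, 2.5, 3.0]
neutral = [1.5, 2.0, 1.5, 2.0, 2.0, 1.5]
print(mean_reaction_time_difference(threat, neutral))
```

A positive value indicates slower recall after threat words; the group figures in Table 2 are the averages of such per-subject differences.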


During the stress and nonstress con- 
ditions, some subjects either failed to 
recall a particular symbol, produced a 
symbol which was not the correct re- 
sponse to the digit, or wrote the cor- 
rect symbol but in a rotated or re- 
versed position. All of these responses 
constituted a failure. When these fail- 
ures are examined, there are no sig- 
nificant differences, either between neu- 
tral and threat conditions for any one 
group, or between subject groups. 
Actually, a relatively small number of
symbols was recalled incorrectly. A
total of only nine incorrect symbols 
was produced by the stutterers and a 
total of 13 by nonstutterers of the 312 
possible total number of symbols for 
each group. 


In summary, the results of this study 
seem to support the contention that 
stutterers are generally more anxious 
and hostile than nonstutterers as meas- 
ured by a projective test and by per- 
formance under laboratory-induced 
stress. Stutterers projected more anxiety 
and hostility in their Rorschach content 
either through emotions and attitudes 
expressed or implied (a dangerous crev- 
ice, an angry face) or through expres- 
sive behavior (a retreating animal, two 
animals fighting) or through symbolic 
responses or cultural stereotypes (dead 
leaves, snakes, spiders, blood) or 
through objects representing, for 
example, aggression (gun, teeth). Fur- 
thermore, when asked to recall 
previously learned material, under con- 
ditions presumed to create consider- 
able anxiety and hostility, the stutterer 
showed significantly more disintegra- 
tion and inefficiency. One important 
factor in this latter, more real-life, ex- 
perimental situation is that the stut- 
terer’s anxiety and hostility seem not to 
have been generated by or associated 
with a speaking situation. The stutterer 
was speaking neither to nor with nor for 
an individual but was simply attempting 
to recall previously learned material 
under the special stress and neutral situ- 
ations. The findings obtained do seem to 
argue that the disruption observed in 
this laboratory situation represents the 
influence of general anxiety and hostil- 
ity which perhaps has become part of 
the general character make-up and 
mode of functioning of the stutterers
investigated. 


It is proposed that anxiety and hos- 
tility not be viewed as separate, perhaps 
independent dimensions related to the 
phenomenon of stuttering, as the litera- 
ture would suggest, but as interrelated 
emotional states or systems of tension 
which characterize the stutterer and 
which can exert a significant influence 
on his ability to operate effectively 
under stress and within interpersonal 
situations. On the basis of the results of 
this study, the theory seems tenable that 
most stutterers develop, by the time 
they are young adults, a predominant, 
enduring emotional disposition charac- 
terized by general anxiety and hostility 
and which has resulted from the stut- 
terer’s having to function throughout 
childhood under almost continual stress 
and threats to his existence as a per- 
sonality. The stutterer, then, enters 
many situations predisposed to mobiliz- 
ing this basic affective attitude. This 
attitude or emotional disposition very 
likely is a detriment to the stutterer’s 
personal adjustment and efficiency of 
functioning. Whether it leads to neu- 
rotic techniques or inefficient adaptive 
defenses, and the conditions under 
which this might occur, are seen as an- 
other set of issues apart from those con- 
sidered here. 


Two practical applications in the un- 
derstanding and treatment of stuttering 
are suggested by the results. First of 
all, Elizur reports that special training 
and knowledge in Rorschach testing is 
not required to administer and interpret 
the RCT. Therefore, it would seem 
that workers in the field of speech cor- 
rection could make use of this tool 
in obtaining some assessment of the 


emotional disposition which this study 
suggests characterizes stutterers. The 
second possible application follows, in 
that it would seem that workers doing 
either speech therapy or psychotherapy 
with stutterers should take into account, 
directly, this emotional disposition in 
understanding and treating the partic- 
ular stutterer. 


Summary 


A conceptual framework was dis- 
cussed relating anxiety, hostility, and 
stress with stuttering. The Rorschach 
test was administered to stutterers and 
nonstutterers and was rated with a 
validated scale for anxiety and hostility. 
In addition, all subjects were asked to 
recall previously learned material in 
neutral conditions (that is, following 
their free associations to neutral words 
spoken by the investigator) and in 
stressful conditions (that is, following 
their free associations to emotionally 
toned words). The results showed that 
the stutterers projected on the Ror- 
schach significantly more content indic- 
ative of anxiety and hostility than did 
nonstutterers. In the laboratory situ- 
ation all subjects showed a significant 
decrement in performance under stress 
conditions as compared to neutral ones, 
but stutterers showed a significantly 
greater decrement than nonstutterers. It 
was theorized that by adulthood a stut- 
terer develops an enduring emotional 
disposition characterized by general 
anxiety and hostility which interferes 
with his personal adjustment and effi- 
ciency of functioning. 
Effects of Listener Sophistication
upon Global Ratings of Speech Behavior 


ERNEST J. BURGI 


JACK MATTHEWS 


Much of the clinical research in speech 
correction depends in large measure up- 
on judges’ ratings of the speech behavior 
of subjects. Examination of the research
literature indicates that for this type of 
activity the researcher usually attempts 
to recruit raters who are sophisticated 
judges, that is, judges trained in the 
field of speech and hearing disorders. 
In most university settings where this 
work is done, such procedures place a 
burden on the limited number of grad- 
uate students in speech and hearing dis- 
orders and upon staff members in that 
area. The fact that there are many de- 
mands upon the time of both groups 
poses a problem for the researcher who 
must locate reliable judges for rating 
speech behavior. If it were possible to 
obtain adequate ratings of speech be- 
havior from unsophisticated listeners, 
the problem of recruiting raters would 
be less difficult, since the number of 
potential raters would be much greater. 

The study reported here is one of a 
series attempting to evaluate differences 
in judging ability among several groups 
of listeners with varying degrees of 
training in speech pathology. Previous
investigations in this series indicated 
that sophisticated and unsophisticated 
judges agree closely when called upon 
to make global ratings of speech be- 
havior. The term ‘global’ here refers to 
the over-all ‘goodness’ or ‘badness’ of 
the speech, as opposed to specific as- 
pects such as rhythm, articulation, and 
such. 


Schaef and Matthews (9), in a study 
concerned with the evaluation of stut- 
tering therapy, utilized both an ‘expert’ 
group and a ‘naive’ group of raters to 
make judgments of severity of stutter- 
ing, amount of tension in the voice, and 
amount of unfavorable attention the 
speech calls to itself. These authors 
found that the group of naive listeners 
did not differ from the experts in ability 
to validly rate the subjects. Validity 
coefficients were determined by using 
the ratings of the original clinician who 
handled the subject in therapy as the 
outside criterion. The interjudge re- 
liability was higher for the naive judges 
than for the experts. Schaef and Mat- 
thews concluded that ‘. . . from a prac- 
tical point of view the results of this 
study indicate that naive judges may 
be used with confidence that no greater 
validity and interjudge reliability would 
be obtained through the use of the less 
numerous experts.’ 
In another study, Phillips (8) provides
a similar indication that sophisticated
and unsophisticated listeners produce
similar ratings of the speech intelligibility
of cleft palate speakers.
Phillips used two rater groups. One 
group consisted of ‘laymen,’ defined as 
people having no formal training in 
speech pathology. The other group 
were ‘speech pathologists,’ defined as 
people having extensive training and 
clinical experience in speech pathology. 

Perrin (7) designed a study for the 
purpose of investigating ‘. . . whether 
or not there is a similarity in the ratings 
of severity given by therapists and lay 
persons to a given speech defect.’ Per- 
rin’s subjects consisted of seven chil- 
dren who had been clinically diagnosed 
as having functional articulatory de- 
fects. Her untrained group of raters in- 
cluded 26 students enrolled in an intro- 
ductory psychology course and her 
trained group consisted of 13 graduate 
students in speech and hearing therapy. 
She found no significant differences in 
the ratings made by the two groups. 

The present study compares the 
speech rating abilities of groups with 
four different levels of sophistication 
frequently found in the university en- 
vironment. 


Procedure 


Speech samples used in the present 
study represented two readings of a 
paragraph by each of 22 subjects who 
had been clinically diagnosed as suffer- 
ing from multiple sclerosis. The samples 
were recorded on magnetic tape at vari- 
ous Veterans Administration hospitals 
in the United States. Speech ranged 
from very intelligible to almost unin- 
telligible. 


The same 22 subjects also provided 
data as part of a larger group participat- 
ing in a study (5, 6) of the effect of 
the drug, isoniazid, on multiple sclerosis. 
In that larger study, 12 of the 22 were 
in the experimental group, receiving 
drug therapy (isoniazid) and 10 were 
in the control group, receiving a place- 
bo. Recordings of the paragraph were 
made before and after administration of 
the isoniazid or the placebo. Record- 
ings of speech samples in addition to 
the paragraph were made in the larger 
study but these and the drug therapy 
are not considered in the present study. 
The rating procedures reported here, 
however, were designed to evaluate the 
effect of the drug as well as to compare 
the rating abilities of different listener 
groups. 

Following is the paragraph which was 
used: 

The trees set out to choose a king to
rule over them. They asked the olive tree
to be king, but the olive tree said, ‘I am
busy making olives and oil; I can’t stop
to be your king.’ Then they asked the fig
tree, but the fig tree said, ‘Do you expect
me to leave off making my good fruit,
just to be king over you?’ Then they
asked the grapevine to be king, and the
grapevine said, ‘Shall I leave my wine,
that gives cheer and comfort to so many,
only to be king?’ At last the thornbush
was the only one they could get to be
king.

The paragraph was typed in capital 
letters. Subjects who could not read it 
for any reason, including visual diffi- 
culties, were not used in the present 
analysis. 

From the original recordings, the 
pairs of paragraph readings by the 22 
subjects were dubbed onto another tape, 
each subject identified by a number on 
the new tape which thus contained a 
series of 22 pairs of paragraphs. Each 
pair contained the pretest and posttest
samples of the same subject, but the 
order of the samples was determined 
randomly. Each speech sample (pair) 
was introduced with these words: ‘This
is pair number . . . , sample number
. . . .’ A practice pair was recorded at the
beginning of the tape. 

Four groups of listeners heard the 
speech samples which were prepared 
as indicated above. The makeup of the 
four groups was as follows: Group 1, 
10 students enrolled in a beginning uni- 
versity course in speech pathology, who 
had had no previous courses in speech 
pathology; Group 2, seven full-time 
college undergraduates, who were en- 
rolled in a second level daytime course 
in speech pathology; Group 3, 18 stu- 
dents, most of them teachers, with 
more listening experience than Group 
2 but enrolled in a night class of the 
same second level course in speech 
pathology; Group 4, seven graduate 
students in speech pathology. 


Speech samples were played to each 
listener group separately through the 
speaker of a tape recorder. Because the 
speech samples had been recorded in 
several different locations on different 
machines, no attempt was made to use 
high fidelity equipment in the listening 
sessions. The speaker was placed in a 
position approximately the same dis- 
tance from each listener. No special 
sound treatment was utilized in the 
listening rooms, but listening was done 
at times when external noise was at a 
minimum. 


Each group of listeners was asked to 
compare the second sample of each pair 
with the first sample of that pair and 
rate the second sample on a five-point 


TABLE 1. Rater reliability for listener groups.

Source                     df      ms      r*     R†

Group 1
  Between Subjects         21     4.17
  Between Listeners         9     1.28
  Residual                189      .82
                                          .29    .80
Group 2
  Between Subjects         21     2.94
  Between Listeners         6     1.22
  Residual                126      .94
                                          .23    .68
Group 3
  Between Subjects         21     7.45
  Between Listeners        17      .64
  Residual                357      .92
                                          .28    .88
Group 4
  Between Subjects         21     3.12
  Between Listeners         6     1.51
  Residual                126      .79
                                          .30    .75
Combined Groups
  Between Subjects         21    13.18
  Between Listeners        41     1.02
  Residual                861      .92
                                          .24    .93

*r = reliability of a single listener; from Ebel (3).
†R = reliability of combined listeners; from Ebel (3).


scale, from zero through four, as a 
great deal worse, a little worse, the same 
as, a little better, or a great deal better 
than the first sample. Instructions were 
given on a rating sheet provided for 
each listener and also were presented 
to the listeners orally by the experi- 
menter. After the instructions, the prac- 
tice pair was played to the listeners and 
time was provided for questions and 
clarification of the instructions before 
the ratings were made. 
Results and Discussion


The data were treated so that each 
rating represented the difference be- 
tween pretest and posttest for each sub- 
ject, rather than the difference between 
the first sample and the second sample. 
This was accomplished by reversing the 
ratings for each pair in which the first 
sample contained the posttest and the 
second sample the pretest. (This pro- 
cedure was necessary to make compari- 
sons not reported in this paper.) 
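
The reversal described above can be illustrated with a brief sketch. It assumes that 'reversing' a rating means mirroring it about the midpoint of the zero-through-four scale (4 minus the rating); the paper does not spell out the arithmetic, and the function name and sample ratings below are hypothetical.

```python
# Hypothetical helper; names and sample data are illustrative, not from the study.
SCALE_MAX = 4  # ratings run from 0 (a great deal worse) to 4 (a great deal better)


def to_pretest_posttest(rating, posttest_first):
    """Express a rating as 'posttest relative to pretest'.

    rating: 0-4 judgment of the second sample relative to the first.
    posttest_first: True when the first sample of the pair was the posttest.
    """
    if not 0 <= rating <= SCALE_MAX:
        raise ValueError("rating must lie between 0 and 4")
    # Assumed reversal: mirror about the midpoint (0<->4, 1<->3, 2 unchanged).
    return SCALE_MAX - rating if posttest_first else rating


# (rating, posttest_first) for three made-up pairs
ratings = [(3, False), (1, True), (2, True)]
adjusted = [to_pretest_posttest(r, flipped) for r, flipped in ratings]
```

Under this assumption a rating of 'the same as' is unaffected by the order in which the two samples were presented.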

The rating data provided by each 
group of listeners for all subjects were 
analyzed in two ways: (a) the reliabil- 
ity of each group of raters was deter- 
mined, and (b) the differences among 
the ratings of each of the groups were 
evaluated. 

Adaptations of Fisher’s intraclass 
formula from analysis of variance have 
been used in previous studies to deter- 
mine rater reliability (1, 2). The use of 
this procedure was particularly suitable 
in this study because it enables a rela- 
tively rapid estimate of a single listener’s 
reliability from each group and also an 
estimate of the reliability of the com- 
bined members of each group. The 
rater reliability for each group of listen- 
ers is presented in Table 1. 
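
As an illustration only, several of the coefficients in Table 1 can be recomputed from the tabled mean squares. The sketch assumes the usual intraclass expressions associated with Ebel (3): r = (MS subjects - MS residual) / (MS subjects + (k - 1) MS residual) for a single listener, and R = (MS subjects - MS residual) / MS subjects for k combined listeners. These expressions are not stated in the text, but they reproduce the reported values. Function and variable names are hypothetical.

```python
def ebel_reliability(ms_subjects, ms_residual, n_listeners):
    """Return (single-listener r, combined-listener R) from ANOVA mean squares.

    Assumed formulas (intraclass correlation in the form given by Ebel):
      r = (MS_s - MS_res) / (MS_s + (k - 1) * MS_res)
      R = (MS_s - MS_res) / MS_s
    """
    diff = ms_subjects - ms_residual
    r_single = diff / (ms_subjects + (n_listeners - 1) * ms_residual)
    r_combined = diff / ms_subjects
    return r_single, r_combined


# (MS between subjects, MS residual, number of listeners) as read from Table 1
table_1 = {
    "Group 1": (4.17, 0.82, 10),
    "Group 2": (2.94, 0.94, 7),
    "Group 3": (7.45, 0.92, 18),
    "Combined": (13.18, 0.92, 42),
}

results = {
    name: tuple(round(value, 2) for value in ebel_reliability(*row))
    for name, row in table_1.items()
}
```

The between-listeners mean square does not enter either expression, which is consistent with treating differences in listeners' overall rating level as irrelevant to reliability.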

The reliability of a single listener’s 
rating is similar in each group. It is 
obvious, from an examination of these 
reliability coefficients for single listen- 
ers, that a prediction formula applied 
to these ratings, estimating reliabilities 
for a fixed number of listeners in each 
group, would yield similar correlations. 
The differences in the reliability of 
combined listeners in each group are 
probably due to differences in the num- 
ber of listeners in each group. When 
differences in the numbers of listeners 
in each group are taken into account,
the reliability of group ratings also ap- 
pears to be similar among the four 
groups. 
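
The 'prediction formula' is not named in the text. A minimal sketch, assuming it is of the Spearman-Brown type, projects each group's single-listener reliability to a common panel size. The panel of seven listeners is chosen here only for illustration, and the names are hypothetical.

```python
def spearman_brown(r_single, n_listeners):
    """Projected reliability of the mean of n_listeners ratings (assumed formula)."""
    return n_listeners * r_single / (1 + (n_listeners - 1) * r_single)


# Single-listener reliabilities reported for the four groups
single = {"Group 1": 0.29, "Group 2": 0.23, "Group 3": 0.28, "Group 4": 0.30}

FIXED_N = 7  # illustrative common panel size
projected = {
    group: round(spearman_brown(r, FIXED_N), 2) for group, r in single.items()
}
```

With the number of listeners held constant in this way, the projected group reliabilities fall within a narrow band, which is the point made in the paragraph above.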


The combined rater reliability is 
higher than the reliability estimates for 
a single listener. Naturally, the reliabil- 
ity of the group increases with an in- 
crease in the number of listeners in the 
group, since the mean rater error must 
be less variable than the individual rater 
error. If the reliabilities are determined 
by combining all the listeners into one 
group the results are those shown in 
the last section of Table 1. In this case 
the reliability of a single listener’s rat- 
ings remains the same but reliability of 
judgments of the entire group is in- 
creased because of the greater number 
of listeners when the groups are com- 
bined. The reliability of combined rat- 
ings for each group in this study might 
be considered adequate for judgments 
of the type involved in this study. 


A repeated measurements analysis of 
variance design patterned after that 
presented by Edwards (4, p. 288) was 


TABLE 2. Summary of analysis of variance to
evaluate differences among four listener
groups on mean ratings.

Source                                       df      ms       F

Between Listener Groups                       3     .96      .94
Between Listeners in Same Group              38    1.02
Between Subjects                             21   13.18    14.98*
Interaction: Subjects x Listener Groups      63    1.50     1.70*
Interaction: Pooled Listeners x Subjects    798     .88

*Significant.
TABLE 3. Correlations between mean ratings
of each pair of listener groups.

Listener Groups      1      2      3

2                   .56
3                   .64    .59
4                   .67    .70    .74








used for the purpose of evaluating the 
differences in the ratings given to the 
subjects by each group of listeners. 
Table 2 summarizes these data. The F 
for listener groups is not significant and 
suggests that the differences in the over- 
all level of the judgments were no 
larger than would be expected to arise 
by chance. This result seems to support 
the results obtained by other investiga- 
tors, reported earlier in this paper, who 
used other types of speech defective 
groups as subjects. 
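
The F ratios in Table 2 can be retraced from the tabled mean squares. The choice of error terms below is an assumption (listener groups tested against listeners within groups, the other two effects against the pooled listeners x subjects interaction); it is not stated in the text, but it reproduces the reported ratios. All names are hypothetical.

```python
# Mean squares as reported in Table 2
ms = {
    "listener_groups": 0.96,
    "listeners_within_groups": 1.02,
    "subjects": 13.18,
    "subjects_x_groups": 1.50,
    "pooled_listeners_x_subjects": 0.88,
}

# Assumed error terms for a repeated-measurements design
f_ratios = {
    "listener_groups": ms["listener_groups"] / ms["listeners_within_groups"],
    "subjects": ms["subjects"] / ms["pooled_listeners_x_subjects"],
    "subjects_x_groups": ms["subjects_x_groups"] / ms["pooled_listeners_x_subjects"],
}

rounded = {effect: round(f, 2) for effect, f in f_ratios.items()}
```

An F below 1.00 for listener groups means that the groups differed from one another no more than individual listeners within a group differed among themselves.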

The F for subjects, which is highly 
significant, merely indicates that the 
subjects used, as reported earlier, ex- 
hibited a wide range of speech abilities. 

The F based on the interaction be- 
tween subjects and listener groups is 
significant. This significant F could re- 
sult if the variances of the four groups 
were significantly heterogeneous. This 
possibility was examined by subjecting 
the data to Bartlett’s test of homogene- 
ity of variance (4). The variances were 
not significantly different from one an- 
other. The significance of this F sug- 
gests a possible conclusion that some 
listener groups evaluated subjects quite 
differently than others did even though 
all listener groups are similar in their 
mean ratings. In order to evaluate this 
possible conclusion, the mean ratings 
provided for each of the 22 subjects, by 
each listener group, were calculated. 


The means for each group were cor- 
related with means for every other 
group (Table 3). The correlations 
ranged from .56 to .74. All of them are 
significantly different from zero at be- 
yond the 1% level. Although correla- 
tions of this magnitude may not be 
considered extremely high, they do in- 
dicate an obvious positive relationship 
between the mean rating provided by 
each listener group and the mean rat- 
ings provided by every other group. 
The differences between these correla- 
tions are not statistically significant. 
Even if the difference between any two 
of these correlations were to prove sig- 
nificant, the differences are quite small 
for practical predictive purposes. The 
correlations between the means of 
Groups 2 and 4 and 3 and 4 are practi- 
cally identical. All the other correla- 
tions are somewhat lower and very 
similar to each other. It may be con- 
cluded from these data that each of the 
groups made similar global judgments 
of the over-all ‘goodness’ or ‘badness’ of 
the speech samples. The fact that these 
correlations are not high is probably 
due in part to poor recordings and in- 
adequate control over some of the data 
collection procedures. 
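
One common way to test whether a correlation differs from zero is the t transformation with n - 2 degrees of freedom. The text does not say which test was used, so the sketch below is an assumption; the critical value 2.845 is the two-tailed 1% point of Student's t with 20 degrees of freedom, and the correlations are those of Table 3.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero with n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


N_SUBJECTS = 22          # each group mean is based on the same 22 subjects
T_CRITICAL_01 = 2.845    # two-tailed 1% critical value, 20 degrees of freedom

# Correlations between group means, keyed by (group, group)
correlations = {
    (1, 2): 0.56, (1, 3): 0.64, (2, 3): 0.59,
    (1, 4): 0.67, (2, 4): 0.70, (3, 4): 0.74,
}

t_values = {
    pair: t_for_correlation(r, N_SUBJECTS) for pair, r in correlations.items()
}
all_significant = all(t > T_CRITICAL_01 for t in t_values.values())
```

Even the smallest coefficient, .56, yields a t of about 3.0 under this test, so every entry in the table clears the 1% criterion.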


The significant interaction between 
listener groups and subjects, coupled 
with the relatively low correlations be- 
tween mean ratings provided by the 
different groups may indicate that some 
subjects are rated differently by differ- 
ent groups, but the analysis of these 
combined data suggests that the differ- 
ences are not consistent in any one di- 
rection for any listener group. These 
data further suggest that any differences 
in average ratings provided by different 
groups are small and not significant. 
The data provided by the correlations
between means support the conclusions
derived from the reliability estimates 
reported in Tables 1 and 2 that there 
is no superiority of one group over 
another in ability to make global ratings 
of the speech behavior of the subjects. 


Summary 


Connected speech samples from 22 
subjects with multiple sclerosis were 
recorded on tape and later evaluated 
by four groups of listeners. The listener 
groups differed with respect to the 
amount of training they had received 
in the area of speech and hearing dis- 
orders. An analysis of the differences 
among the mean ratings of each group 
of listeners indicated that there were 
no significant differences. Correlations 
between the mean ratings for each sub- 
ject by each group are significantly 
different from zero and not signifi- 
cantly different from each other. These 
findings are in agreement with previous 
researchers who have reported no sig- 
nificant differences between trained and 
untrained listener groups in terms of 
global or over-all ratings of the speech 
behavior of groups of subjects with 
speech problems. 
Use of Hearing Aids by Young Children


GEORGINA RUSHFORD 


EDGAR L. LOWELL 


Little is known of the actual use of 
hearing aids by young children. Their 
parents occasionally report satisfaction 
or dissatisfaction with a hearing aid, but 
the reports are anecdotal and unsystem- 
atic. In view of the frequent differences 
of opinion related to the use of hearing 
aids by young children, the collection 
of some systematic information was 
undertaken in the present study. 


Method 


The information in this report was 
based on a questionnaire mailed to 
families of deaf children previously en- 
rolled in the Correspondence Course 
of John Tracy Clinic. The question- 
naire was developed to elicit informa- 
tion on items of interest in connection 
with the use of hearing aids by young 
children. The areas covered were: de- 
gree of deafness, agency or individual 
first determining hearing loss and rec- 
ommending hearing aid, types of test 
used, child’s reaction to the hearing 
aid, actual use made of the hearing aid,
areas of satisfaction and dissatisfaction, 
and purchase of additional hearing aids. 

The questionnaire was pretested on 
a sample of 250 families and modified 
on the basis of this experience. The final 
three-page form was mailed to a sample 
of 5000 families. The completed forms 
were returned by 1515 families by the 
time the analysis was undertaken. By 
modern polling standards (1, p. 95,
chap. 11) this was considered a good 
return for this kind of mail question- 
naire, particularly since the mailing list 
was not up-to-date and stamped return 
envelopes were not included. 


Some portions of the questionnaire 
could be answered by ‘yes’ or ‘no’: 


Did you receive guidance and advice as 
to the techniques to be used in helping 
your child accept his first hearing aid? 
yes no 

Does your child wear his hearing aid to 
school? yes no 

Are you satisfied with your child’s pres- 
ent hearing aid? yes no 











Other questions required a simple reply:

How long did it take your child to learn
to put the earmold in his ear?

How many hearing aids have you purchased?

What level or grade in school does your
child attend?

Others required the completion of a
check list:


What would be the things you would
most like to see improved on hearing aids?
Check: __receiver, __earmold, __cord, __microphone,
__battery, __size, __carrying case.





Other questions were open ended: 


What was your child’s reaction to using
his first hearing aid?

What is the school’s attitude about the use
of hearing aids?

The material was coded according to 
a plan designed in connection with the 
questionnaire. The coded responses 
were punched on IBM cards and the 
data processed on the IBM 709 com- 
puter of the Western Data Processing 
Center, using a 709 program for ques- 
tionnaire data processing designed by 
Dr. John R. B. Whittlesey. 


Subjects. The sample consisted of 
1515 children, 810 (53.5%) boys and 
705 (46.5%) girls, a division consistent 
with previous normative studies (1).
Average hearing losses were divided 
into three groups: 30 to 70 db (18.6%); 
70 to 90 db (43.7%); and greater than 
90 db (33.3%). The degree of loss was 
not determined in 4.4% of the sample. 
The sex ratios were similar at all three 
levels of hearing impairment (Table 1). 
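
The return rate and the sex distribution follow directly from the counts given above. The short sketch below simply retraces that arithmetic; the variable names are illustrative.

```python
# Counts reported in the Method section
mailed = 5000      # questionnaires sent
returned = 1515    # completed forms available for analysis
boys, girls = 810, 705

return_rate = 100 * returned / mailed   # percentage of forms returned
pct_boys = 100 * boys / returned        # percentage of respondents' children
pct_girls = 100 * girls / returned

# The two sex counts should account for every returned form.
counts_consistent = (boys + girls == returned)
```

On these figures roughly three forms in ten were returned, which is the return the authors describe as good for a mail questionnaire of this kind.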

On the basis of the father’s occupa- 
tion, the families came from a variety of 
socioeconomic backgrounds: 30.1% of 
the fathers were in the professions, 


TABLE 1. Distribution of boys and girls relative
to hearing loss level (total N = 1515).

Hearing Loss        N      Boys     Girls

30 to 70 db        282    56.4%    43.6%
70 to 90 db        661    52.2%    47.8%
Over 90 db         505    52.9%    47.1%
Not classified      67
TABLE 2. Distribution of children relative to age
when first hearing aid was used. Of the total
sample (N = 1515), 20.4% had not used a hearing
aid.

Years           N     % Total

Under 2.5      208     13.7%
2.5 to 3.4     242     16.0%
3.5 to 4.4     248     16.4%
4.5 to 5.4     229     15.1%
5.5 to 6.4     123      8.1%
6.5 to 7.4      63      4.2%
7.5 to 8.4      42      2.8%
8.5 to 9.4      51      3.4%








22.3% were in the so-called white- 
collar and sales positions, 26.1% were 
skilled laborers and journeymen, 10.2% 
were semiskilled laborers, and the bal- 
ance were other kinds of workers or 
were unemployed. 

The mailing list for the questionnaire 
was comprised of the names of families 
of the 5000 most recent enrollees of the 
Correspondence Course. The geograph- 
ical distribution of the 1515 respondents 
was widespread: 262 replies were re- 
ceived from eight New England states; 
165 from five eastern states; 275 from 
seven north central states; 151 from 
11 southern states; 225 from five south- 
western states; 225 from seven central 
states; 121 from six northwestern states; 
and 91 from Canada and Hawaii. 

It is difficult here, as with most other 
mail questionnaires, to estimate the re- 
liability of the responses received. It is 
possible that this sample is biased in 
favor of families with special interest 
in their child and his hearing problem 
as indicated (a) by enrolling in the 
Correspondence Course, and (b) by 
taking time to participate in this study. 
Therefore the results must be assessed 
with the possibility of such bias in mind. 
If the purchase of a hearing aid by the 
parent can be taken as an indication of 
TABLE 3. Distribution of children in percentages relative to age in years when first hearing aid was
purchased and age at time of questionnaire (N = 1490).

Rows: Age First Aid Purchased (Years)
Columns: Present Age (Years): Under 4.5, 4.5 to 6.4, 6.5 to 9.4, 9.5 and Over

Under 2.5 35.3% 18.7% 13.6% 6.0%
2.5 to 3.4 21.8% 24.6% 14.3% 10.8%
3.5 to 4.4 6.0% 22.6% 19.4% 12.7%
4.5 to 5.4 1.5% 10.9% 21.3% 16.8%
5.5 to 6.4 6% 11.4% 12.2%
6.5 to 7.4 4.6% 7.4%
7.5 to 8.4 38% 2.2% 5.5%
8.5 to 9.4 8% 38% 8.4%
No Aid or No Answer to Question 34.6% 22.0% 13.2% 20.2%








a parent’s interest in his child, then the 
sample of respondents may not be 
overly biased with eager parents for the 
results show that 20.5% had not pur- 
chased any hearing aid. 


Responses to Questionnaire 


Age at Time of First Hearing Aid. 
A majority of the 1515 children 
(79.5%) had been fitted with at least 
one hearing aid. Of these, 22.7% had 
purchased two aids, 8.9% had pur- 
chased three aids, 2.3% had pur- 
chased four aids, 5.4% had purchased 
five or more aids, while 40.2% were 
still wearing the first aid purchased. 

It was found that for 61% of the 
sample, the first hearing aid was used 
at an age under 5.5 years (Table 2). A 


comparison of current age of child
with age when the first hearing aid was
purchased (Table 3) shows a trend
toward placing hearing aids on younger
children.


Tests and Recommendations. A tabu-
lation of agencies or individuals first
determining the degree of loss and first
recommending the purchase of a hear-
ing aid (Table 4) indicates that both
doctors and audiologists play a more
prominent role in the determination of
loss than they do in the recommenda-
tion for purchase of an aid while for
teachers of the deaf the reverse is true.
This seems reasonable in view of the
kind of work each does with the child.

A comparison of the age of the child
and the profession involved in the first
recommendation of a hearing aid (Fig-
ure 1) indicates that more of the early
evaluations are made by audiologists
than by medical people and that more
of the later evaluations are made by
medical people than by audiologists.


TABLE 4. Distribution of children in percentages according to agents making the initial determination
of hearing loss and recommendation for hearing aid (N = 1515).

Agents                        Determination of Loss     Recommendation of Aid
                                   %         N               %         N

Otologists                       24.2%      366            11.8%      179
Other Doctors of Medicine        15.4%      234             7.9%      119
Audiologists                     48.1%      730            35.5%      538
Teachers of Deaf                                           18.0%      273
Hearing Aid Dealers               1.7%       26             3.0%       46
All Others                        6.6%      101             2.2%       34
Not Reported                      4.0%       58            21.6%      326


FIGURE 1. Comparison of medical and audiologic determination of hearing loss at various
age levels. (Axis label: Age When First Hearing Aid Purchased.)


Method of Testing. Regardless of 
who first determined degree of loss, a 
variety of test instruments was em- 
ployed: an audiometer was used in 
81.7% of the tests, gross sounds in 
51.8%, tuning forks in 44.6%, and the 
galvanic skin response test in 33%. Only 
67.1% of the parents responding to the
questionnaire felt that an accurate
determination of the hearing loss had
been made prior to the purchase of the
first hearing aid.


TABLE 5. Percentages of children reporting initial
satisfaction with aid for each of eight age levels
of first purchase of hearing aid. A total of 309
children had not purchased a hearing aid or did
not answer this question.

Age First Purchase      N     Initial Satisfaction

Under 2.5              208         57.7%
2.5 to 3.4             242         47.5%
3.5 to 4.4             248         48.8%
4.5 to 5.4             229         47.2%
5.5 to 6.4             123         46.3%
6.5 to 7.4              63         46.0%
7.5 to 8.4              42         47.6%
8.5 to 9.4              51         60.8%


Satisfaction with First Hearing Aid. 
The child’s reaction to and acceptance 
of his first hearing aid was reported as 
good or satisfactory by 43.9% of the 
parents. The other parents who had 
purchased aids reported indifferent, 
poor, or slow acceptance. 

Age appears to be related to the 
child’s early acceptance of his first 
hearing aid, but the relationship is not 
a linear one. More children (Table 5) 
were reported to have a good initial 
reaction in the youngest age group 
(under 2.5 years) and in the oldest 
group (8.5 to 9.4 years) than in any 
of the other age groups. A possible 
explanation is that at the early ages 
the child is more tractable than at the 
‘middle ages’ and therefore accepts 
training without difficulty and with 
little question, while at the later stages 
he accepts more readily because he can 
be reasoned with and is more likely 
to see the need or value of using the aid. 
Similar curvilinear relationships have 
been reported in other training areas 
(4). 

If the child did not readily accept his first hearing aid or if he
rejected it, parents reported the following complaints or explanations:
aid uncomfortable (8.3%); did not see the value of the aid (7.7%); did
not like the earmold (4.3%); a new experience (3.5%); no previous
experience with sound (3.2%).


Among the dimensions of acceptance studied, it was found (Table 6) that
39.6% of the children were willing to put on the aid and use it from
the start; 27.5% were able to adjust the volume without assistance; and
20.8% were able to insert their earmold without assistance from the
start. Information on percentages of children requiring time is given
in Table 6.

TABLE 6. Percentages of children acquiring certain attitudes and skills
associated with use of hearing aid (N = 1515) and time at which
attitudes and skills were acquired.

Time               Willingness to Put   Ability to Adjust    Ability to Insert
                   on Aid and Use       Volume Unassisted    Earmold Unassisted
From Start         39.6%                27.5%                20.8%
Under 5 Months     12.2%                13.9%                16.3%
5 to 12 Months      5.2%                 6.4%                 7.5%
12 to 24 Months     8.5%                 7.7%                 8.1%
Over 2 Years        0.0%                 5.7%                 5.8%
Not yet Learned     7.8%                10.6%                14.5%
Not Classified     26.7%                28.2%                27.0%


Use of Hearing Aids. Parents re- 
ported that 45.8% of the children make 
maximum use of the aid, that is, wear 
it at all times as though it were an 
article of clothing. Of the 68.6% of 
the sample attending school (preschool 
through the sixth grade), 48.8% wore 
their aids during the entire school day. 
An additional 14.0% wore their aids 
during part of the school day. Nearly 
all of the schools involved (92.3%) 
encouraged the use of aids. 

With reference to leisure time, par- 
ents reported that the hearing aid was 
most likely to be worn during the fol- 
lowing activities or periods: watching 
TV (23.4%); in the evening (18.4%); 
after school (16.6%); and before school 
(15.7%). They also reported that it 
was least likely to be worn in play, 
particularly in rough play (19.3%). 


Areas of Satisfaction and Dissatisfaction. Parents in this sample
reported that they were satisfied with the performance of their child’s
current hearing aid in 52.6% of the cases. Those who were dissatisfied
listed the following complaints: the aids did not have enough power
(7.9%); they did not help the child (5.0%); they were too large (4.4%);
they caused distortion (3.5%); they were too noisy (3.5%); they were
not sturdy enough (1.9%); the dials moved too easily (0.7%); the child
needed two aids (2.6%); the service was poor (0.6%).

TABLE 7. Percentage of each age group using aid all day and percentage
of each age group reporting satisfaction with present aid. A total of
309 children had not purchased a hearing aid or did not answer this
question.

Age First Purchase     N      Use All Day    Satisfied with
(years)                                      Present Aid
Under 2.5              208    68.3%          64.4%
2.5 to 3.4             242    57.9%          56.2%
3.5 to 4.4             248    60.9%          59.7%
4.5 to 5.4             229    62.4%          59.4%
5.5 to 6.4             123    50.4%          56.9%
6.5 to 7.4              63    42.9%          50.8%
7.5 to 8.4              42    26.2%          38.1%
8.5 to 9.4              51    19.6%          43.1%








Without reference to their current satisfaction or dissatisfaction,
parents reported that they would like to see the following items
improved on hearing aids for their children: cords (34.7%); size
(21.7%); earmolds (21.0%); carrying cases (15.5%); batteries (13.3%);
receivers (11.1%); microphones (9.0%); controls (4.5%); cost (2.0%).

While there is a curvilinear relation- 
ship between acceptance of the first 
hearing aid and the age when the first 
aid was purchased, the other indicators 
of satisfaction and use of hearing aids 
show a linear relation to age (Table 7). 


TABLE 8. Percentages of children able to use hearing aid (adjust
volume, insert earmold) independently from time of purchase of first
hearing aid for various age levels of first purchase. A total of 309
children had not purchased a hearing aid or did not answer this
question.

Purchase Age           N      Volume        Earmold
                              Adjustment    Insertion
Under 2.5 years        208    11.1%          7.7%
2.5 to 3.4             242    21.5%         13.6%
3.5 to 4.4             248    29.4%         14.5%
4.5 to 5.4             229    40.2%         31.4%
5.5 to 6.4             123    45.5%         36.6%
6.5 to 7.4              63    58.7%         54.0%
7.5 to 8.4              42    76.2%         71.4%
8.5 to 9.4              51    80.4%         76.5%








There is more satisfaction when the aid 
is purchased early than when it is pur- 
chased later, but, as might be expected, 
there is difficulty in mastering the use 
of it, as indicated by the percentage of 
children who were able to adjust their 
volume controls and to insert their own 
molds (Table 8). Children whose aids 
were purchased early take longer to 
master the independent use of the aid 
than do those beginning at an older age. 
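
The contrast drawn above between a curvilinear relation (initial satisfaction, Table 5) and roughly linear ones (Table 7) can be checked informally with a correlation against age-group rank. The sketch below is only illustrative: the percentages are transcribed from Tables 5 and 7, while the use of group rank (0 to 7) as the age variable and of Pearson's r as the index of linearity are assumptions made here, not part of the original analysis.

```python
from math import sqrt

# Percentages transcribed from Tables 5 and 7 (eight age-at-first-purchase groups,
# ordered from "Under 2.5" to "8.5 to 9.4").
initial_satisfaction = [57.7, 47.5, 48.8, 47.2, 46.3, 46.0, 47.6, 60.8]  # Table 5
use_all_day = [68.3, 57.9, 60.9, 62.4, 50.4, 42.9, 26.2, 19.6]           # Table 7
satisfied_present = [64.4, 56.2, 59.7, 59.4, 56.9, 50.8, 38.1, 43.1]     # Table 7


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Assumption: age groups are represented by their rank order, not by midpoints.
age_rank = list(range(8))

r_initial = pearson_r(age_rank, initial_satisfaction)
r_use = pearson_r(age_rank, use_all_day)
r_present = pearson_r(age_rank, satisfied_present)

print(f"initial satisfaction vs. age rank:   r = {r_initial:+.2f}")
print(f"all-day use vs. age rank:            r = {r_use:+.2f}")
print(f"satisfaction with present aid:       r = {r_present:+.2f}")
```

A near-zero r for initial satisfaction alongside strongly negative values for the two Table 7 measures is what a U-shaped versus a declining trend would be expected to produce; it does not replace a formal test of trend.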

Among those parents who reported 
satisfactory use of their child’s first 
hearing aid, 79% reported that they 
had received advice and guidance, while 
19.7% reported that they had received 
no help or guidance. 

Another factor influencing the initial satisfaction with the hearing
aid was the degree of hearing loss. For the group of children with
30-70 db loss, 46.1% of the parents reported good initial satisfaction;
in the 70-90 db loss group, 46.6% reported good initial satisfaction;
while in the greater than 90 db loss group only 32.3% reported initial
satisfaction.

It might be speculated that the eco- 
nomic position of the family would be 
a factor related to the time of the first 
purchase of a hearing aid. It was found 
that 16.5% of the families having fathers 
in major or minor professional groups 
had not purchased aids; 16.9% of the 
white collar and sales worker group 
had not; 22.5% of the skilled laborer 
and journeymen group had not; and 
19.2% of the semiskilled group had not. 
These figures would suggest that eco- 
nomic factors, at least insofar as re- 
vealed by the occupation of the father, 
do not determine whether or not a 
hearing aid is purchased. 


TABLE 9. Percentages of children at each of three hearing loss levels
(exclusive of 67 children not so classified) and age at which first
hearing aid was purchased.

Age First Aid          Hearing Loss (db)
Purchased (Years)      30 to 70       70 to 90       Over 90
                       (N=282)        (N=661)        (N=505)
Under 2.5              11.0%          16.8%          12.1%
2.5 to 3.4             15.2%          20.3%          11.3%
3.5 to 4.4             20.2%          18.9%          11.7%
4.5 to 5.4             [illegible]    [illegible]    [illegible]
5.5 to 6.4             [illegible]    [illegible]    [illegible]
6.5 to 7.4              4.6%           3.9%           4.4%
7.5 to 8.4              7.1%           2.9%           3.8%
8.5 to 9.4              1.1%           2.1%           6.7%
None Purchased         12.5%          10.4%          32.8%








The relationship of degree of deafness to the age when the first
hearing aid was purchased (Table 9) indicates that children with the
most severe loss are least likely to have an aid during the ages
covered by this study.
Summary


This report summarizes a survey of 1515 families of young deaf children
concerning their experiences with hearing aids. The information was
obtained from a questionnaire mailed to parents who had been enrolled
in the Correspondence Course of the John Tracy Clinic. The report
describes characteristics of the sample population: age at which
hearing loss was discovered, agency or individual determining the
degree of loss, use of first and subsequent hearing aids, and areas of
satisfaction and dissatisfaction with the hearing aid. Certain factors
are considered in relation to the child’s subsequent satisfaction and
use of the aid.


RESEARCH NEWS NOTE


Physiology of Speech Breathing


A study of the physiology of speech breathing is being initiated at the
Hospital School for Severely Handicapped Children, University of Iowa.
The project will be directed initially toward determining (a) the
actual magnitudes of breath pressure associated with various speech
acts; (b) whether the breathing musculature has the role of (1)
maintaining steady air pressure during speech upon which the larynx and
articulators act to produce the speech signal or (2) initiating
discrete fluctuations of such air pressure subglottally and intraorally
during running speech; and (c) what systematic patterns of respiratory
muscular activity are associated with changes in articulatory and
laryngeal activity.

The research plan calls for simultaneous recording of electromyographic
potentials of the breathing musculature, air flow from the mouth,
intrathoracic pressure, and speech signals in such a way that the
dynamics of the phenomena may be studied in detail. The project will
eventually lead to study of the speech breathing problems of children
with cerebral palsy.

The main researcher will have direct consultation with members of the
Department of Speech Pathology and Audiology, Department of Physiology,
and Departments of Anesthesia, Internal Medicine, Medical Electronics,
and Pediatrics, of the College of Medicine, University of Iowa. This
investigation is supported by PHS research grant B-2662 from the
National Institute of Neurological Diseases and Blindness, Public
Health Service.


James C. Hardy, M.A.
Main Researcher of Project
University of Iowa, Iowa City
A Laminagraphic Study of Vocal Pitch


HARRY HOLLIEN 


JAMES F. CURTIS 


The technique of laminagraphic x ray 
is by no means new as a medical diag- 
nostic tool but it has had only limited 
use in laryngeal research. Prior to 1954, 
when the present study was begun, only 
Griesman (1) had used the procedure 
for studying laryngeal vocal phe- 
nomena. He presented and discussed 
laminagrams made on several singers 
while they were producing various 
vocal pitches. His work showed that 
the laminagraphic approach was un- 
doubtedly useful in the study of la- 
ryngeal phenomena. In addition, his re- 
sults suggested that there may be a 
systematic trend in the cross-sectional 
size of the vocal folds which would 
correlate with certain of the acoustical 
parameters of voice, specifically the 
fundamental frequency of phonation. 
Recently a number of laminagraphic investigations of laryngeal
phenomena have been reported. Some of these, such as the studies of
Sonninen and Vaheri (6) and Zaliouk and Izkovitch (8) have been
primarily concerned with clinical manifestations. Luchsinger and Dubois
(4) have applied laminagraphic procedures to the investigation of a
single subject with an extraordinary pitch range in excess of five
octaves, and Moolenaar-Bijl (5) and van den Berg (7) have used the
technique in fundamental investigations of laryngeal physiology.


Harry Hollien (Ph.D., University of Iowa, 1955) is Assistant Professor
of Logopedics, University of Wichita and Institute of Logopedics. James
F. Curtis (Ph.D., University of Iowa, 1942) is Professor and Head,
Department of Speech Pathology and Audiology, and Director, Speech
Laboratory, University of Iowa. This article is an adaptation of a
paper read at the International Voice Conference, Chicago, 1957, and a
paper presented at the 1956 convention of the American Speech and
Hearing Association, Chicago.

The purpose of the present study was 
to investigate trends relating changes 
in the cross-sectional dimensions of the 
vocal folds with variations in vocal 
pitch. These comparisons were obtained 
from measurements made on coronal 
views of the larynx seen on lamina- 
graphic x rays. 


Procedure 


Subjects. Four groups of subjects 
were selected: Group LM, six very low 
pitched male voices; Group HM, six 
very high pitched male voices; Group 
LF, six very low pitched female voices; 
Group HF, six very high pitched fe- 
male voices. These subjects were 
selected from a group of 254 volunteers 
by procedures that have been described 
in detail elsewhere (2). 


Equipment. Equipment included a Keleket Selecto-plane laminagraphic
x-ray unit. This system was operated with a Multicron 200 ma generator,
a Dynamax Number 40 x-ray tube and a 0.3 mm focal spot. The travel
distance of the x-ray unit was 20 in. and the exposure time 1.5 sec.
Current and voltage settings were 25 ma and 68 to 82 kv, respectively.


December 1960


Figure 1. Laminagraphic x-ray equipment with subject in position for
making laminagrams of the larynx. The supporting cork blocks and sponge
rubber padding are not shown in this illustration. The electronic unit
in the foreground is the voice level monitor.


Subject Placement. Figure 1 shows a subject in position on the
laminagraphic table. The use of sponge rubber padding, cork blocks, and
a head immobilizer allowed for the subject’s comfort, prevented him
from moving, and enabled the investigator to position him as required.
Each subject was placed on the table so that the central ray would pass
through his larynx at the level of the vocal folds. The laryngeal
prominence and lateral x rays were used to locate vocal fold level, and
adjustments were made with reference to a system of markers intrinsic
to the equipment.

One of the problems associated with 
this research was that subjects were re- 
quired to maintain a supine position 
while the x-ray films were made. Un- 
fortunately the distorting effects on the 
larynx due to the pull of gravity are 
not known and it was not possible to 
compare these results with other lami- 
nagraphic results obtained with the 
subject erect. However, it seems un- 
likely that this posture would produce 
a differential effect of sufficient magni- 
tude to invalidate comparisons among 
group means. It is likewise improbable 
that the effects of this possible source 
of error would be differential from one 
pitch to another for a given subject. 
Nevertheless the possibility of distor- 
tion due to the supine posture should 
be recognized and the results should be 
interpreted conservatively. 


Selection and Control of Vocal 
Pitches. Each subject was required to 
phonate at four pitch levels. Three 
were chosen to represent a distribution 
of levels within the subject’s normal 
register and the fourth was produced in 
falsetto. These pitches were specified in 
relation to the subject’s total pitch 
range and were located as proportions 
above his lowest sustainable tone. The 
10, 25, 50, and 85% points to the 
nearest semitone were chosen. In addi- 
tion, each subject was x rayed under a 
condition of no phonation. 
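
As an illustration of how the four target pitches could be located, the sketch below places the 10, 25, 50, and 85% points within a total range expressed in semitones and rounds each to the nearest semitone. The function name and the example range (80 to 640 cps) are hypothetical and serve only to show the arithmetic; the subjects' actual ranges are not given in this section.

```python
import math


def target_pitches(lowest_hz, highest_hz, proportions=(0.10, 0.25, 0.50, 0.85)):
    """Return (proportion, semitones above lowest tone, frequency) for each target.

    The total range is measured in semitones above the lowest sustainable tone,
    and each proportional point is rounded to the nearest semitone.
    """
    range_semitones = 12 * math.log2(highest_hz / lowest_hz)
    targets = []
    for p in proportions:
        offset = round(p * range_semitones)          # nearest semitone
        frequency = lowest_hz * 2 ** (offset / 12)   # equal-tempered step
        targets.append((p, offset, frequency))
    return targets


# Hypothetical subject with a three-octave range (36 semitones).
example = target_pitches(80.0, 640.0)
for proportion, offset, frequency in example:
    print(f"{proportion:.0%} point: {offset} semitones above lowest tone, "
          f"about {frequency:.1f} cps")
```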

Control of the fundamental fre- 
quency of phonation was obtained by 
use of a reference tone provided by an 
ordinary chromatic pitch pipe with a 
range from F, to F,. Because of the 
limited range of this device, training 
periods were held until each subject 
was able to produce the selected pitches 
in the proper octave. 


Control of Vocal Intensity. Since the size, shape, and action of the
vocal folds may vary with changes in the intensity of the tone being
produced, this variable was controlled. During the training sessions
each subject was required to produce three samples of each of his four
pitches at what he felt were ‘comfortable’ intensity levels. Sound level
readings of these attempts were ob- 
tained from a General Radio type 759B 
sound level meter used with its flat 
weighting network and with its micro- 
phone placed 10 in. from the subject’s 
lips. The levels from each of these 
trials were recorded and the mean value 
of all readings at each pitch calculated 
by groups, that is, the mean was com- 
puted for the readings obtained on the 
low (10%) pitch for the low male 
group, then for the medium (25%) 
pitch for the same group, and so on. 
These means were used as the intensity 
level settings for that particular group 
at each specific pitch. Thus levels which 
were both comfortable and constant 
for all subjects were predetermined. 


A voice level monitor with two neon 
reference lights was used to provide 
cues to the subject, indicating when he 
was maintaining the required vocal in- 
tensity. This unit was calibrated in 
such a way that when the subject 
reached a vocal intensity within 2 db 
of that required, one of the neon lights 
would glow; if he exceeded this level 
by 2 db, both lights would glow. Dur- 
ing the experimental runs, the micro- 
phone of the voice level monitor was 
placed 10 in. from the subject’s lips as 
he lay on the laminagraphic table. One 
set of reference bulbs was placed so he 
could easily see them from the supine 
position and another pair was located 
where the experimenters could observe 
them. 
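
The behavior of the two reference lights can be summarized as a small decision rule. The sketch below is one reading of the description above, not a specification of the actual circuit: it assumes that one light glows once the level is no more than 2 db below the required level, and that both glow once the level is 2 db or more above it. The function name and the example levels are hypothetical.

```python
def lights_lit(measured_db, required_db, tolerance_db=2.0):
    """Number of neon reference lights glowing for a given vocal intensity.

    Assumed reading of the monitor description:
      0 lights: more than `tolerance_db` below the required level
      1 light:  within `tolerance_db` of the required level
      2 lights: `tolerance_db` or more above the required level
    """
    if measured_db >= required_db + tolerance_db:
        return 2
    if measured_db >= required_db - tolerance_db:
        return 1
    return 0


# Hypothetical required level of 70 db for one group at one pitch.
for level in (66.0, 68.5, 70.0, 72.5):
    print(f"{level:.1f} db -> {lights_lit(level, 70.0)} light(s)")
```

Under this reading the subject's task is simply to keep exactly one light glowing for the duration of the exposure.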


Determining the Laminagraphic Plane. It was considered desirable to
make measurements in a plane which passed approximately through the
midpoint of the anteroposterior length of the vocal folds. In order to
do this, a series of three or more test exposures was taken as each
subject phonated his medium (25%) pitch. One of these x rays was taken
at a plane calculated from lateral x-ray films to pass approximately
through the midpoint. Two other exposures were taken, one anterior and
one posterior to the first. For the women the distance between these
planes was 1 cm and for the men, 2 cm. When the test exposures had been
developed, they were examined and compared with measurements of
laryngeal dimensions (obtained from the lateral x rays) and with the
external dimensions of the subject’s neck. Based on these comparisons,
the equipment was adjusted to the settings that would yield the best
definition and yet be closest to the anteroposterior midpoint. The
planes of the experimental laminagrams varied between 11 and 15 cm from
the cassette.


Conditions of Exposure. Once the subject was in position and the proper
settings determined, laminagrams were made of each of the five
experimental conditions. First the reference tone was provided and the
frequency of the subject’s sustained phonation evaluated. Then
observations were made to determine whether or not he was producing the
pitch at the proper vocal intensity. When both conditions were met, the
subject was requested to inhale and repeat the tone. If both the pitch
and intensity were satisfactory, a laminagram was immediately exposed.
Finally, after the exposure had been made, both the pitch and intensity
of the subject’s vocal output were again checked. If either was not
satisfactory, the laminagram was retaken. Figure 2 is an illustration
of a laminagram.


Figure 2. An example of a coronal cross-section laminagram showing the
laryngeal area and the vocal folds during phonation.


Measurements. Measurements were 
attempted only on the films made dur- 
ing phonation. For the nonphonation 
or rest condition, measurements were 
not possible because the vocal folds 
were usually not distinguishable from 
the lateral laryngeal walls. That is, they 
apparently receded laterally, or turned 
upward against these walls, to such a 


degree that a distinct outline could 
rarely be seen. 

The measurement process will be described with reference to Figure 3.
The mesial borders of the vocal folds and laryngeal tract were
carefully outlined and are represented by lines A-B and A’-B’. To
measure this cross-sectional area requires that lateral boundaries for
the vocal folds be established. This presented a problem since there
are no sharply defined lateral limits for these soft tissue folds and
the laminagrams did not show anatomical landmarks useful in determining
natural boundaries. For this reason it was necessary to make use of
boundaries which, although admittedly arbitrary, were nevertheless
comparable and consistent for all films and allowed for valid
comparisons. These boundaries were two sets of parallel lines, C-D and
C’-D’, E-F and E’-F’, which were drawn with reference to two sets of
points characteristically defined in the wall contours: Y and Y’, the
points at which, a short distance below the vocal folds, the lateral
dimension of the trachea showed a definite maximum; X and X’, the
points above this level approximately at the base of the vocal folds
where the walls, after tapering gradually toward the midline, inflected
sharply inward. To insure that the lines drawn through X and X’, and Y
and Y’ were parallel, the midline of the vocal tract (Z-Z’) was
determined as the line connecting a point midway between the vocal
folds and a point midway between points Y and Y’. Lines C-D and C’-D’,
lateral borders (to be referred to as the laryngeal wall reference
line), were then drawn parallel to this midline through points X and
X’. Lines E-F and E’-F’, the second set of lateral borders (to be
referred to as the tracheal wall reference line), also were drawn
parallel to the midline but through points Y and Y’.


Figure 3. Tracing of a laminagram of the vocal folds showing the
reference lines used for the cross-sectional area measurements of the
mesial projection of the vocal folds.

One further problem remained in defining boundaries for the
cross-sectional area measurements. Since the outline of the ventricle
often did not extend far enough laterally to intersect E-F and E’-F’,
it was necessary to close the area bounded laterally by the tracheal
wall reference line. This was done by extending the outlines of the
superior vocal fold surfaces until they intersected with these
reference lines at G and G’.

Three sets of measurements were made. The first was the small area of
the vocal folds as they extend mesially into the larynx from the
laryngeal wall reference line. The second measurement included the
vocal folds plus the lateral supporting walls bounded by the tracheal
wall reference line. The third measure was of mean thickness of the
portion of the vocal folds that projects mesially from the laryngeal
wall reference line. This measurement was obtained by dividing each
area measurement by the corresponding lateral extent of the fold. Each
of these three measurements was made for both vocal folds and the
measurements for the two folds were averaged to give a single value;
that is, there were three single measurements for each subject.


TABLE 1. Analysis of variance of the area of vocal fold projection from
the laryngeal wall reference line.

Source                 df     ms         F        F.05    F.01
Between Subjects       23
  Group (G)             3     2303.68    44.67    3.10    4.94
  Error                20       51.57
Within Subjects        72
  Frequencies (F)       3      916.91    64.34    2.67    4.13
  FG                    9       47.25     3.32    2.10    2.82
  Error (w)            60       14.25
Total                  95
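
The mean thickness measure described above is a simple quotient, averaged over the two folds. A minimal sketch follows; the function names and the numerical values are hypothetical and are used only to show the computation, with areas assumed to be in square millimeters and lateral extents in millimeters.

```python
def mean_thickness(area_mm2, lateral_extent_mm):
    """Mean thickness of one fold: cross-sectional area divided by lateral extent."""
    return area_mm2 / lateral_extent_mm


def subject_value(left, right):
    """Average the measurements for the two folds to give one value per subject."""
    return (left + right) / 2


# Hypothetical measurements for one subject at one pitch.
left_fold = mean_thickness(area_mm2=40.0, lateral_extent_mm=8.0)    # 5.0 mm
right_fold = mean_thickness(area_mm2=44.0, lateral_extent_mm=8.0)   # 5.5 mm
thickness = subject_value(left_fold, right_fold)
print(f"mean thickness for this subject: {thickness:.2f} mm")
```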

No corrections for x-ray enlargements were made for the following
reasons: (a) Because the distance from the x-ray plane to the cassette
was so small and varied so little (from 11 to 15 cm) there probably
would be no significant differences in enlargement from subject to
subject. (b) Since the central ray was directed through the midline of
the larynx at the level of the laryngeal prominence, and since all of
the areas and distances to be measured were quite small and lay very
close to this central ray (where the angle of divergence is least),
error due to distortion would be correspondingly small.


Results and Discussion 


Area. Table 1 presents the results of the analysis of variance to
determine the statistical significance of trends in cross-sectional
vocal fold area measurements based on the laryngeal wall reference
line. (Since the trends based on the tracheal wall reference line were
very similar, there seemed little justification in reporting both sets
of data.) In order to provide for comparisons among the fundamental
frequencies within pitch groups and also to evaluate the differences
among groups, a Type I mixed design was used (3). It may be seen in
Table 1 that the F associated with the pitch group differences is
substantially larger than that required for significance at the 1%
level. Hence, the over-all trend showing that groups with lower pitch
levels have larger vocal fold cross section may be regarded as highly
significant. The F associated with change in fundamental frequency of
phonation is also very large so that the over-all trend toward a
smaller cross-sectional area, as subjects vary fundamental frequency of
phonation upward, may also be considered highly significant.


[Graph: cross-sectional area plotted against relative frequency level
in per cent of range, with curves labeled by pitch group (e.g., LOW
FEMALE, HIGH FEMALE).]

Figure 4. Variation in area of vocal fold cross section with change in
fundamental frequency of phonation. The area values plotted (in square
millimeters) are group means of the measurements of the vocal fold
cross section, mesial to the laryngeal wall reference line. Relative
frequency level values show the location of tones as proportions of the
total phonational range above the subject’s lowest sustainable vocal
frequency.
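
The F values in Table 1 can be recomputed from the tabled mean squares as a check on the transcription. The sketch below assumes the usual arrangement for this kind of mixed design, in which the group effect is tested against the between-subjects error and the frequency and interaction effects against the within-subjects error; the critical values are those printed in Table 1 and are not recomputed here.

```python
# Mean squares and 1% critical values as printed in Table 1.
ms_group, ms_error_between = 2303.68, 51.57
ms_freq, ms_interaction, ms_error_within = 916.91, 47.25, 14.25

f_ratios = {
    "Group (G)": ms_group / ms_error_between,
    "Frequencies (F)": ms_freq / ms_error_within,
    "FG": ms_interaction / ms_error_within,
}
critical_01 = {"Group (G)": 4.94, "Frequencies (F)": 4.13, "FG": 2.82}

for source, f_value in f_ratios.items():
    verdict = "exceeds" if f_value > critical_01[source] else "does not exceed"
    print(f"{source:16s} F = {f_value:6.2f} ({verdict} the tabled 1% value "
          f"of {critical_01[source]})")
```

The recomputed ratios agree with the tabled F values to two decimal places.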

Evaluation of the significant value of F obtained for the
frequency-by-group interaction may be made by reference to Figure 4.
Taken at face value this interaction would indicate that the trend
lines shown in Figure 4 are significantly different from one group to
another. However, as shown by the graph, these trends are actually so
similar that the small differences which result in this significant
interaction seem relatively unimportant by comparison to the general
trend, namely, that for all groups there is a decrease in vocal fold
cross section as vocal pitch is raised. The significant interaction
does indicate that the exact functional relationship between these two
variables may be somewhat different from group to group.


TABLE 2. Means of the measures of vocal fold cross-sectional area and
the critical difference between these means.

Group          Fundamental Frequency of Voice                              Means*
               Low (10%)    Medium (25%)    High (50%)    Falsetto (85%)
Low Male       55           43              38            31               42
High Male      37           30              27            26               30
Low Female     31           27              22            18               24
High Female    25           19              16            14               18
Mean†          37           30              26            22

* The critical difference for evaluating differences among these means is 4.34.
† The critical difference for evaluating differences among these means is 2.14.


TABLE 3. Analysis of variance of measures of the mean thickness of the
vocal folds.

Source                 df     ms       F         F.05    F.01
Between Subjects       23
  Groups (G)            3     18.01     14.97    3.10    4.94
  Error                20      1.20
Within Subjects        72
  Frequencies (F)       3     51.06    124.54    2.67    4.13
  FG                    9       .93      2.27    2.10    2.82
  Error (w)            60       .41
Total                  95


Table 2 presents the group means of the measures of vocal fold
cross-sectional area. It is organized so that the group values are
presented in the rows, with row five containing the means of all groups
for each pitch. In the columns are values by vocal pitch with column
five containing the over-all means of each group for all pitches.
Several comparisons may be made from Table 2. First, by observing the
means within each column, differences from group to group may be seen.
It is apparent that for all four of the frequencies there is systematic
variation among the pitch groups in the direction of a decreasing mean
size of the cross-sectional area corresponding to an increase in the
pitch level of the group. Although some overlapping was found for
individuals within groups, there were no reversals among the means and
they are consistent in direction
throughout the entire table. These results are not consistent with
earlier findings (2) that, for indices of general laryngeal dimensions,
the size differences between the high male and low female pitch groups
tend to be disproportionately large compared to their differences in
pitch level. Rather, the difference between the high male and low
female groups for the cross-sectional dimension tends to be somewhat
smaller than that between the two male or the two female groups. Since
the high male and low female groups show a relatively small pitch
difference, this comparison suggests that vocal fold cross section may
be more closely related to pitch level differences than are some of the
other indices of laryngeal size. In any event, it may be seen that if
the critical difference of 4.34 is applied to these group means, all
differences between means are statistically significant.
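
The statement about the critical difference can be verified directly from the over-all group means in Table 2. The sketch below compares every pair of group means against the tabled critical difference of 4.34; the values are transcribed from Table 2, and the variable names are chosen here for illustration.

```python
from itertools import combinations

# Over-all group means (square millimeters) from the last column of Table 2.
group_means = {
    "Low Male": 42,
    "High Male": 30,
    "Low Female": 24,
    "High Female": 18,
}
CRITICAL_DIFFERENCE = 4.34  # as given in the footnote to Table 2

differences = {
    (a, b): abs(group_means[a] - group_means[b])
    for a, b in combinations(group_means, 2)
}

for (a, b), diff in differences.items():
    flag = "significant" if diff > CRITICAL_DIFFERENCE else "not significant"
    print(f"{a} vs. {b}: difference = {diff} ({flag})")
```

The smallest pairwise difference is 6 square millimeters, which is larger than 4.34, so all six comparisons exceed the critical difference.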

Another important comparison that may be made concerns the change
within the group in the cross-sectional size of the vocal folds with
the rise in fundamental frequency of phonation. If the means within
rows are compared, it is apparent that the cross-sectional areas of the
vocal folds decrease with increasing pitch level. This progression is
entirely regular.

Further evaluation of Figure 4 points up an interesting relationship
that is not apparent in evaluating Tables 1 and 2. That is, the rate at
which cross-sectional vocal fold area decreases with increase in the
fundamental frequency of phonation is apparently greatest in the low
frequency portion of the subject’s range and becomes proportionately
less as frequency rises. Also, the least change is very often seen
between the third and fourth pitches. This fourth pitch, however, is
produced in a different register, falsetto, and it may be possible that
were this register to be analyzed in some detail, the resulting trend
would be different from those trends noted for the natural register.


Figure 5. Variation in mean vocal fold thickness with change in
fundamental frequency of phonation. The measures were obtained by
dividing the area of vocal fold cross section by the lateral dimension.
Relative frequency level values show the location of tones as
proportions of the total phonational range above the subject’s lowest
sustainable vocal frequency.


TABLE 4. Means of the measures of mean thickness of the vocal folds and
the critical differences between these means.

Groups         Fundamental Frequency of Voice                              Means*
               Low (10%)    Medium (25%)    High (50%)    Falsetto (85%)
Low Male       9.2          6.9             5.6           5.0              6.7
High Male      6.8          5.6             4.7           4.3              5.4
Low Female     6.8          5.4             4.9           3.8              5.2
High Female    6.8          4.9             3.7           3.1              4.6
Mean†          7.4          5.7             4.7           4.0

* The critical difference for evaluating differences among these means is 0.66.
† The critical difference for evaluating differences among these means is 0.36.
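
The observation that area falls off fastest in the low portion of the range can be illustrated from the group means in Table 2 by expressing each change as a slope per percentage point of range. This is a sketch under the assumption that the four pitches sit at exactly the 10, 25, 50, and 85% points; the helper name is hypothetical.

```python
# Relative frequency levels (per cent of range) and group means from Table 2.
levels = [10, 25, 50, 85]
area_means = {
    "Low Male": [55, 43, 38, 31],
    "High Male": [37, 30, 27, 26],
    "Low Female": [31, 27, 22, 18],
    "High Female": [25, 19, 16, 14],
}


def slopes_per_percent(values, positions):
    """Change in area per percentage point of range between successive pitches."""
    return [
        (values[i + 1] - values[i]) / (positions[i + 1] - positions[i])
        for i in range(len(values) - 1)
    ]


group_slopes = {g: slopes_per_percent(v, levels) for g, v in area_means.items()}

for group, slopes in group_slopes.items():
    formatted = ", ".join(f"{s:+.2f}" for s in slopes)
    print(f"{group:12s} slopes (sq mm per per cent of range): {formatted}")
```

For every group the first interval (10 to 25%) shows the steepest decline, which is consistent with the trend described for Figure 4.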


Mean Thickness. Table 3 presents the 
statistical analysis for evaluating the 
differences among the means of the 
pitch level groups and the means of the 
different fundamental frequencies pho- 
nated by the subjects. Again the large 
values of F for both these sets of means 
established that the over-all trends may 
be considered statistically significant. 
Once more the frequency-by-groups 
interaction was found to be significant, 
indicating possible variation from group 
to group in the true shape or slope of 
the trend lines that may be seen in 
Figure 5. As stated, however, these 
group-to-group variations would ap- 
pear to be minor when compared to 
the over-all effects that have been dem- 
onstrated. 


Table 4 is structured identically with 
Table 2 and presents the means of the 
measures of vocal fold thickness and the 
critical difference between these means. 


Trends for decreasing thickness with
increasing vocal frequency are shown
by both the comparisons among the
groups and the comparisons among the
frequency levels within the groups.
Trends are also shown for decreasing 
thickness with increasing pitch by both 
the comparisons among the groups and 
the comparisons among the group levels 
within frequencies. Again, the four 
trend lines seen in Figure 5 show the 
same general configuration and demon- 
strate a consistent relationship between 
thickness, as measured here, and funda- 
mental frequency of voice. 
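
The comparison of frequency-level means against a critical difference can be illustrated with a short computation. The following Python sketch is not part of the original analysis. It recomputes the marginal means from the cell means reported in Table 4 and checks the drop between adjacent frequency levels against the larger of the two critical differences given with the table (0.66), used here as a conservative threshold. The variable names are illustrative only, and the recomputed means may differ slightly from the published ones because the cell means are themselves rounded.

```python
# Cell means of vocal fold thickness as read from Table 4
# (rows: pitch-level groups; columns: relative frequency levels).
thickness = {
    "Low Male":    [9.2, 6.9, 5.6, 5.0],
    "High Male":   [6.8, 5.6, 4.7, 4.3],
    "Low Female":  [6.8, 5.4, 4.9, 3.8],
    "High Female": [6.8, 4.9, 3.7, 3.1],
}
levels = ["Low", "Medium", "High", "Falsetto"]

# Assumption: the larger of the two reported critical differences is used
# as a conservative threshold for the frequency-level means.
CRITICAL_DIFFERENCE = 0.66


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Marginal means: one per group (rows) and one per frequency level (columns).
group_means = {group: mean(cells) for group, cells in thickness.items()}
freq_means = [
    mean([cells[i] for cells in thickness.values()])
    for i in range(len(levels))
]

# Decrease in mean thickness between each pair of adjacent frequency levels.
drops = [freq_means[i] - freq_means[i + 1] for i in range(len(freq_means) - 1)]

for (lower, higher), drop in zip(zip(levels, levels[1:]), drops):
    verdict = "exceeds" if drop > CRITICAL_DIFFERENCE else "does not exceed"
    print(f"{lower} -> {higher}: drop of {drop:.2f} {verdict} {CRITICAL_DIFFERENCE}")
```

Under these assumptions each successive drop exceeds the threshold, and the drops become smaller as frequency rises, which agrees with the general configuration of the trend lines described for Figure 5.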

Tables 2 and 4 and Figures 4 and 5 
show relationships between measure- 
ments of vocal fold cross section and 
fundamental frequency wherein the 
latter variable is, in all cases, expressed 
in relative terms. That is, for all of the 
comparisons made thus far, a tone is 
considered low or high in terms of its 
location within the subject’s pitch 
range. Thus, these vocally produced 
tones were treated as comparable if 
they occupied the same relative position 
in the subjects’ ranges irrespective of
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Figure 6. Area of vocal fold cross section as 
a function of absolute frequency level. The 
area values (in square millimeters) are group 
means of vocal fold cross section, mesial to 
the laryngeal wall reference line. Frequency 
level values were obtained by converting the 
fundamental frequency of phonation to semi- 
tones above a reference frequency of 16.35 
cps. 


how low or how high they actually 
were on the frequency scale. 


Another comparison may be made. 
The vocal pitches that subjects were 
required to produce may be expressed 
as frequency levels in semitones above 
zero reference level. This procedure 
was carried out and is reported graphi- 
cally in Figures 6 and 7. These data 
show a remarkably close relationship 
between cross-sectional dimensions,
whether area or thickness, and absolute
frequency level. For example, if the
reader ignores the two points for the 
male falsetto tones in Figure 6, the re- 
maining points fall very close to one 
line. Thus a single curve could be fitted 
to all of these remaining points without
doing any substantial violence to the
data. This is even more striking for Fig-
ure 7, for, if a single curve were fitted
to the principal concentration of points, 
the only marked discrepancy from the 
general relationship therein expressed 
would be for the low pitched tone of 
the high female group. In short, these 
curves seem to show a general relation- 
ship between vocal fold thickness and 
absolute frequency that transcends dif-
ferences in laryngeal anatomy between
pitch groups. Moreover, this relation-
ship seems to predominate over intersex
differences including those of general
laryngeal size and vocal fold length.
The hypothesis that vocal fold cross 
section is an important correlate of 
vocal frequency is strongly suggested. 
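
The conversion of fundamental frequency to frequency level, as described in the captions to Figures 6 and 7, can be sketched as follows. This Python example is an illustration rather than the procedure actually used. It applies the standard relation of 12 semitones per octave to the stated reference frequency of 16.35 cps. The function names are hypothetical, and the second function assumes that relative level is the proportion of the subject's range measured in semitones, which the text does not state explicitly.

```python
import math

REFERENCE_CPS = 16.35  # reference frequency given in the figure captions


def semitones_above_reference(frequency_cps, reference_cps=REFERENCE_CPS):
    """Frequency level in semitones: 12 * log2(f / f_ref)."""
    if frequency_cps <= 0 or reference_cps <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(frequency_cps / reference_cps)


def relative_level(frequency_cps, lowest_cps, highest_cps):
    """Position of a tone as a proportion of the phonational range.

    Assumption: the range is measured in semitones above the subject's
    lowest sustainable frequency.
    """
    lowest = semitones_above_reference(lowest_cps)
    highest = semitones_above_reference(highest_cps)
    if highest <= lowest:
        raise ValueError("highest_cps must exceed lowest_cps")
    return (semitones_above_reference(frequency_cps) - lowest) / (highest - lowest)


# Hypothetical tone: 130.8 cps is three octaves above 16.35 cps.
print(round(semitones_above_reference(130.8), 2))

# Hypothetical two-octave range from 100 to 400 cps; 200 cps lies midway.
print(round(relative_level(200.0, 100.0, 400.0), 2))
```

Two tones an octave apart differ by 12 semitones on this scale regardless of the speaker, which is what allows tones from different subjects to be placed on a common absolute axis.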
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Figure 7. Thickness of the vocal folds as a
function of absolute frequency level. The
measures of thickness were obtained by
dividing the area of vocal fold cross section
by the lateral dimension. Frequency level
values were obtained by converting the
fundamental frequency of phonation to semi-
tones above a reference frequency of 16.35
cps.
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Conclusions 


On the basis of the experimental find- 
ings, the following conclusions may be 
stated: (a) Individuals with low pitch 
levels exhibit larger, more massive vocal 
folds than do individuals with higher 
pitch levels. (b) As the fundamental 
frequency of an individual’s voice is 
raised, the vocal folds are reduced in 
cross-sectional area and become thinner. 
The rate of change of these dimensions 
with changes in frequency is more 
marked in the low frequency portion 
of the subject’s pitch range. (c) Cross- 
sectional dimensions of the vocal folds 
seem to be correlated with absolute 
frequency level to a greater degree than 
with relative level within the subjects’ 
pitch ranges. This tendency is evident 
no matter what an individual’s pitch 
level or laryngeal dimensions may be; 
hence it appears that one of the most 
important determiners of vocal pitch 
may be the mass or thickness of the 
vocal folds. 


Summary 


By means of frontal laminagrams, 
measurements of vocal fold cross-sec- 
tional area and thickness were obtained 
from 24 young adult subjects including 
six males with low pitched voices, six 
males with high pitched voices, six fe- 
males with low pitched voices, and six 
females with high pitched voices. Meas-
urements were made under four con-
ditions of phonation representing four
fundamental frequencies which sampled
the complete pitch ranges of the sub- 
jects, including falsetto. Intensity was 
maintained constant for the different 
phonations. 

The data show significant group dif- 
ferences with low pitched subjects ex- 
hibiting larger vocal fold areas and 
thickness. Significant differences were 
also found between fundamental fre- 
quencies. The folds became less massive 
and thinner as frequency was raised 
with larger changes occurring in the 
low frequency portion of the subjects’ 
ranges. Absolute frequency appeared to 
be more closely related to cross-sec- 
tional fold dimensions than was relative 
level within the subjects’ pitch ranges. 
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Parents’ Diagnoses of Stuttering 


CHARLES I. BERLIN


The literature suggests that parents of 
children who develop stuttering may be 
unusually intolerant of childhood non- 
fluencies (2, 7, 9, p. 126, 10, p. 157, 11), 
and would be more inclined than par- 
ents of other children to label normal 
repetitions as stuttering. In support of 
this hypothesis, Bloodstein, Jaeger, and 
Tureen (2) reported that parents of 
stutterers diagnosed more stuttering 
from samples of recorded speech than 
did parents of nonstuttering children. 
The speech samples used were record- 
ings of 12 children, six stutterers and 
six nonstutterers. These children were 
asked to tell a story about a series of 
picture cards. The samples of speech 
were played to the parents who were 
asked simply to state whether they 
thought the child in question stuttered. 


The present study was also concerned 
with parents’ diagnoses of stuttering 
from recorded speech samples. How- 
ever, the design of the speech samples 
and the conditions of presentation were 
changed and additional control subjects 
were used. Of paramount importance 
was the fact that the term stuttering 
was avoided in recruiting subjects and 
in one of the two listening conditions, 
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in view of reports* (3, 16) that asking
judges to listen for stuttering specifi- 
cally increases the amount of stuttering 
they diagnose. 


Procedure 


Eight 100-word scripts were con- 
structed as if a child were telling a 
story about a cartoon. Each passage 
contained differing frequencies of non- 
fluency representing a continuum from 
total fluency to more than four standard 
deviations above the means for fre- 
quencies of nonfluency reported by 
several investigators (4, 5, p. 81, 10, 15). 
In the scripts, a word repetition, a 
single staller, and a single syllable rep- 
etition appeared in this fashion: 
‘Blacky is-is, uh, b-biting momma.’ An
additional word and syllable repetition
would read: ‘Blacky-Blacky is-is, uh,
b-biting m-momma.’

Table 1 shows frequencies of non- 
fluencies, classified by type of nonflu- 
ency, which appeared in the passages. 
These frequencies were selected to rep- 
resent an eight-point continuum of 
nonfluency, with Passage 4 at the mid- 
point containing nonfluencies of the 
several types approximately matching 
in number the means of data from the 
several studies mentioned above. 

A group of third grade elementary 
school children read these passages ver- 
batim. They were instructed to read the 


*Personal communication from Dr. Oliver 
Bloodstein, Brooklyn College. 
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Table 1. Numbers of nonfluencies appearing in each 100-word passage of the nonfluency
scale. Numbers for Passage 4 are based on means reported by other investigators (4, 5, p. 81,
10, 15).

                                       Reading
Type of Nonfluency        1    2    3    4    5    6    7    8
Repetitions
  Syllable                0    0    1    1    1    2    2    3
  Word                    0    0    1    2    3    3    4    5
  Phrase                  0    1    1    2    3    4    5    5
Total Repetitions         0    1    3    5    7    9   11   13
Total Words Involved      0    6   14   24   32   41   50   59
Stallers                  0    0    1    2    3    4    5    6







repetitions in the script as if they were 
actually using them in their daily 
speech. In this fashion the recordings of 
these readings constituted a controlled 
and graded scale of nonfluency. 
Two graduate students with written 
transcriptions of the text at their dis- 
posal then listened to the contrived 
recordings. They located and agreed 
upon 98% of the intended nonfluencies, 
but found two nonfluencies which were 
not intended. These were deleted from 
the recordings by dubbing in a new 
phrase to take the place of the phrase in 
which the extra nonfluencies occurred. 


Validation of the Scale and Experi- 
mental Precautions. Spontaneous re- 
sponses to the cartoons, made by eight 
classmates of the children in the con- 
trived recordings, were randomized on 
tape with the contrived samples. In or- 
der to demonstrate whether or not the 
contrived readings were essentially in- 
distinguishable from spontaneous re- 
sponses, both types of recordings were 
presented to two groups of listeners: 
a group of 36 lay listeners, and a group 
of 11 sophisticated judges, the latter 
being those speech clinicians then at 
the University of Pittsburgh who held, 
or were eligible to receive, at least the 


Basic Certificate of the ASHA. The 
two groups of judges were asked to 
indicate whether a given sample seemed 
to be contrived or candid. 

The sophisticated judges identified, 
on the average, only 11.4 samples cor- 
rectly, and the nonprofessional judges 
selected, on the average, only 9.7 pas- 
sages correctly. It was thus apparent 
that the contrived samples and the can-
did samples could not easily be dis-
tinguished by either group. Additional
analysis showed that for both groups 
combined a mean of 7.2 correct judg- 
ments was made in identifying the eight 
candid samples, while a mean of 3.4 
correct judgments was made in iden- 
tifying the eight contrived samples. 
None of the judges was able to identify 
all of the contrived passages. 

It was important that the increments 
of nonfluency appearing in the passages 
be sufficiently graded to make a dis- 
criminating judgment possible. There- 
fore, the eight contrived passages were 
randomized along with two candid re- 
cordings of stuttering, one tentatively 
deemed by the experimenter to be 
slightly more severe than the other. 
This 10-anchor scale was played for 30 
lay and 10 of the previously described 
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sophisticated judges. Both sets of judges 
were asked to rate on a continuum 
from 1 to 10 the amount of non-
fluency demonstrated by each passage.
The number 10 was used to represent
an extremely nonfluent passage, the
number 1 to represent a fluent pas-
sage. (The sophisticated judges were
asked also to list any speech abnor- 
malities they perceived, other than in 
fluency or in rhythm. They reported 
none.) The intended rank order of the 
passages and the obtained rank order 
were identical, indicating that there 
were sufficient differences between ad- 
jacent passages of the 10-anchor scale 
for discriminating judgments. It is 
suggested that contrived and carefully 
controlled speech samples can be used 
in lieu of candid samples for research 
purposes. 


Subjects. A total of 210 parents, rep- 
resenting 137 family units, participated 
in this study. In this group were 43 
families with at least one stutterer, 36 
with children who had no reported 
speech problems, and 58 with at least 
one child with misarticulations. Parents 
of children with misarticulations were 
used as an additional control group in 
view of opinion and reports (1, p. 3, 13,
p. 107, 14, 18, 19, p. 92 ff.) that parents
of children with speech problems other 
than stuttering may be more intolerant 
of speech imperfections than parents of 
normal speaking children. For the pur- 
poses of evaluating the data, the parents 
were divided into groups on the basis of 
their children’s speech: parents of stut- 
terers (N = 67), parents of children 
without speech problems (N = 57), 
and parents of children with misartic- 
ulations (N = 86). 

The ages of the subjects’ children 


ranged from 4.5 to 21 years with a 
mean age of 10 for stutterers, 10.5 for 
those with misarticulations, and 11.5 
for those without speech problems. All 
children presenting speech problems 
had received from 12 to 140 hours of 
speech therapy from ASHA certified 
members. No attempt was made to con- 
trol the sophistication of individual par- 
ents to stuttering theory or research. 
Only one type of disorder was present 
in each family group. 

The subjects were recruited from 
among parents of children in Brooklyn 
College, S. J. Tilden High School, and 
Yeshiva Flatbush speech clinics in New 
York, and the University of Pittsburgh 
and Uniontown Hospital speech clinics 
in Pennsylvania. In recruiting these par- 
ents, special care was taken to talk about 
the study in terms of development of 
communication problems rather than in 
terms of stuttering. This care to avoid 
mention of stuttering was carried 
through into the introductory remarks 
and instructions for Condition 1. 

The parents were roughly matched
for socioeconomic status through the 
use of the Warner technique (17, p. 
157). On the Warner Scale an Index of 
Social Status Characteristics of from 
38 to 50 indicates a middle class social 
status. The parents tested presented the 
following mean social status scores: par- 
ents of stuttering children, 46; parents 
of normal speaking children, 41; and 
parents of children with articulatory 
errors, 40. 


Conditions of Presentation. The scale
was presented once under Condition 1 
and then under Condition 2. The differ- 
ence between the two conditions was 
in the instructions: Condition 1 in-
structions, ‘If this were your child, and
this were generally the way he spoke
between the ages of two and five years, 
would you be concerned with his 
speech? If so, tell us exactly what this 
child did to cause you this concern’; 
Condition 2 instructions, ‘If this were 
your child, and this were generally the 
way he spoke between the ages of two 
and five, would you say that he stut-
tered?’ In the second condition the
parents were asked to simply say ‘yes’ 
or ‘no.’ Papers were collected between 
conditions to prevent answers on one 
condition from affecting the answers 
on the other. 


Results 


The Effect of Condition 1 and Con- 
dition 2 on the Diagnosis of Stuttering. 
Asking parents (Condition 2) to decide
between ‘stuttering’ and ‘no stuttering’ 
elicited significantly more diagnoses of 
stuttering than asking them (Condition 
1) to list exactly what the child had 
done to cause them concern. Mothers 
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Figure 1. Ideal curve describing tolerance
of parents of stutterers and parents of
normal speaking children for the non-
fluencies of childhood as extrapolated from
Johnson’s (11) findings.


of normal speaking children were the 
only subgroup to be unaffected signif- 
icantly by the change in instructions 
(Table 2).

Within either condition the mean 
number of diagnoses of stuttering 
(Table 2) remains approximately the 
same from one group of parents to an- 
other. Without statistical tests the state- 


Table 2. Means and results of t tests for evaluating mean differences between two conditions
(C1 and C2) with respect to numbers of diagnoses of stuttering by three groups: parents of
stutterers (A); parents of children with misarticulations (B); and parents of children with no
speech problems (C).

                                  Parent Group
          A                     B                     C
          M           t         M           t         M           t
Both
  C1      [illegible]           2.91                  [illegible]
                      5.90*                 5.91*                 3.3[?]*
  C2      4.75                  4.50                  4.14
Fathers
  C1      2.66                  2.52                  3.29
                      4.15*                 3.42*                 3.67*
  C2      4.89                  4.00                  4.61
Mothers
  C1      2.82                  3.15                  3.19
                      4.06*                 4.84*                 1.84
  C2      4.63                  4.81                  3.86







*Significant at 1% level. 
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Figure 2. Comparison of tolerance of parents 
of stutterers, parents of children with mis- 
articulations, and parents of normal speaking 
children for childhood nonfluencies, Con- 
dition 1. 
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Figure 3. Comparison of tolerance of parents
of stutterers, parents of children with mis- 
articulations, and parents of normal speaking 
children for childhood nonfluencies, Con- 
dition 2. 

ment could be made that differences are 
unimportant (t-test results range from 
.53 to 1.84). 

Over the 10 levels of the nonfluency 
scale, the number of diagnoses of stut- 
tering was, in general, a function of the 
amount of nonfluency in the sample 
passages. Johnson (11) reported that 
more than 90% of the parents of stut- 
terers in his study misdiagnosed normal 
nonfluency as stuttering. Figure 1 shows 
what would have happened in the pres- 
ent experiment if more than 90% of the 
parents of stuttering children had diag- 
nosed stuttering at Level 4, and retained 
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this diagnosis for all samples more se- 
vere in nonfluency than that at Level 
4. Those parents of children who devel- 
oped normal speech would not have 
been expected to be so overwhelmingly 
intolerant of the speech sample at Level 
4 but would be expected to make the 
largest number of diagnoses of stutter- 
ing at Levels 9 and 10. Figures 2 and 3 
present the diagnoses for Conditions 1 
and 2 as they actually occurred. 


From Figure 2 it can be seen that 
under Condition 1 the number of diag- 
noses made by all the parent groups 
seemed to be a function of the amount 
of nonfluency written into the passages. 
The parents of children who stuttered 
did not make an unusually high number 
of diagnoses of stuttering in response 
to normal nonfluency at any level. It 
appears, however, that normal nonflu- 
ency was being misdiagnosed as stutter- 
ing by a large number of parents in all 
the groups. At Levels 4 (mean), 5, and 
6, approximately 10%, 20%, and 30%, 
respectively, of each parent group diag-
nosed stuttering. It was observed that
many parents misdiagnosed normal non- 
fluency as stuttering with apparently 
little relation to the type of speech 
which their own children exhibited. It
was also noted that some parents in all 
groups did not label the recordings of 
stuttering children as stuttering. Most 
of these parents used words like ‘re- 
peated excessively,’ ‘hesitated too much,’ 
‘stumbled over himself,’ to describe the 
passages previously agreed upon as 
‘stuttering.’


From Figure 3 it can be seen that 
under Condition 2 the number of diag- 
noses made by parent groups followed 
a trend similar to that under Condition 
1. The number of diagnoses for each 
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Figure 4. Comparison of tolerance of parents 
of stutterers for the nonfluencies of child- 
hood under Condition 1 and Condition 2. 


group increases, in general, with in- 
creasing amounts of nonfluencies writ- 
ten into the passages. The over-all dif- 
ferences between groups are small. At 
some levels, most particularly at Levels 
2 and 5, parents of stutterers heard 
more stuttering than did the parents of 
normal speaking children. These two 
differences are significant, the first be- 
yond the 1% level, the second beyond 
the 5% level (t = 3.22, 2.54, with
122 df). This evidence, however, is not
enough to justify a conclusion that 
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Figure 5. Comparison of tolerance of moth- 
ers of normal speaking children for childhood 
nonfluencies under Condition 1 and Condition
2.


there is any real difference between 
these two groups of parents under this 
condition. Comparisons of groups of 
fathers, groups of mothers, and intra- 
group comparisons of fathers and 
mothers provided no evidence of sig- 
nificant differences. 

Figure 4 shows the change between 
the two conditions exhibited by parents 
of stutterers which was typical of all 
the other groups of parents. Figure 5 
shows the relatively consistent perform- 
ance of the mothers of normal speaking 
children from condition to condition. 


Discussion 


The Bloodstein, Jaeger, and Tureen 
study (2) sampled the responses of par- 
ents to nonfluencies under conditions 
analogous to Condition 2 of the present 
study. Here it was found that all parent 
groups except mothers of normal speak- 
ing children diagnosed more stuttering 
when the word ‘stuttering’ was included 
in the instructions than when it was not. 
Previously reported differences between 
parents of stutterers and parents of non- 
stuttering children regarding tolerance 
for nonfluency must take into consider- 
ation the effect on listeners of including 
the word ‘stuttering’ in their instruc- 
tions. 


Furthermore, parental intolerance of 
nonfluency is not the only factor which 
Johnson considers in the development 
of stuttering. He states (8, p. 238) that 
three interacting variables contribute 
to the onset of stuttering: the listener’s 
sensitivity to the speaker’s nonfluency, 
the speaker’s degree of nonfluency as
objectively determined, and the speak-
er’s sensitivity to his own nonfluency.
However, he concludes later (8, p. 241) 
that ‘. . . at the point of origin of the 
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problem of stuttering the most crucial 
single factor to be considered is that of 
the listener’s sensitivity to the speaker's 
nonfluencies, his inclination to evaluate 
them as undesirable and distressing, and 
particularly his tendency to classify 
these specifically as stuttering.’ 


The present study has objectively
predetermined the speaker’s nonflu-
encies and found that at the time and
under the conditions of the investiga-
tion parents of stuttering children were
no more sensitive to nonfluencies than
parents of children who did not develop
stuttering. It is a weakness of this study
that parents of stuttering children were
not studied when their children were
developing speech rhythm problems. It
is possible, however, that persons other
than the parents of these children, per-
haps teachers, relatives, or contempo-
raries, could have been responsible ini-
tially for exposing the stuttering chil-
dren to high fluency standards (6).


Perhaps the child’s sensitivities to his 
own fluency (or nonfluency) should 
be studied further. Glasner and Rosen- 
thal (7) report that parents’ attempts 
to reduce nonfluencies by suggestions 
such as, ‘think before you speak,’ ‘take 
a deep breath,’ ‘slow down,’ were made 
to many children who did not retain 
excessively nonfluent speech. General 
semantic theory (12, p. 19) on which
Johnson based his early thinking of 
stuttering (10, p. 432) suggests that how 
a child perceives and internalizes the 
intent of listener reactions to nonflu- 
encies is a catalytic factor which may 
determine the impact upon a child of 
a misdiagnosis of nonfluency. 


The two factors which exerted the 
most influence on the amount of stut-
tering diagnosed by the listeners in this
study were (a) the amount of intrinsic
nonfluencies in the samples, and (b) the
wording of instructions, that is, whether 
or not the listeners were instructed to 
listen specifically for stuttering. This 
latter finding supports the results of 
Boehmler (3) and Tuthill (16). 
Mothers of normal speaking children 
were the only group who showed a 
stable assessment of nonfluencies under 
both experimental conditions of this
study. 


Summary 


Contrived and controlled samples of 
nonfluent speech were presented under 
two conditions to 67 parents of children 
who stutter, 86 parents of children with 
misarticulations, and 57 parents of nor- 
mal speaking children. In Condition 1 
the parent was asked if his child’s speech 
caused concern, and if so, what did the 
child do that caused concern; in Con- 
dition 2 the parent was asked if the 
child stuttered. The samples were con- 
structed to represent a continuum of 
nonfluency from -3 to +4 standard
deviations of nonfluency into stuttering. 
Construction of controlled samples re- 
sembling candid speech was found feasi- 
ble. 

Mothers of normal speaking children 
did not change their diagnoses signifi- 
cantly from Condition 1 to Condition 
2, but all other parents diagnosed signif- 
icantly more stuttering in Condition 2 
than in Condition 1. It appeared that 
the wording of instructions was a fac- 
tor. Parents of stuttering children were 
not unusually intolerant of nonfluency 
compared to the other parents since 
members of all parent groups misdiag- 
nosed some normal nonfluency as stut- 
tering. 
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Scaling Defectiveness of Articulation 


by Direct Magnitude-Estimation 


ELIZABETH MOODIE PRATHER 


Various psychological and psycho-
physical scaling techniques have been
used in experimentation in speech path- 
ology, usually to gain specific informa- 
tion about certain aspects of speech. 
Studies by Sherman and Goodwin (5), 
Powers (3), and Weiss (10) are repre-
sentative. In another group of studies,
among them those by Sherman (4) and 
Morrison (2), emphasis has been on 
establishing practical and useful appli- 
cations of one particular scaling method. 
Only one study, reported by Sherman 
and Moodie (6), has been designed to 
compare several different scaling meth- 
ods in order to evaluate their usefulness 
for scaling defectiveness of articulation. 

The search for useful psychological 
scaling techniques has been carried on 
to a greater extent in fields other than 
speech pathology. The questions raised 
in such a search and in the evaluation of 
any one scaling method, however, can- 
not be answered easily from the results 
obtained in other fields, such as in 
psychology. The ‘degree of defective- 
ness of speech’ is a qualitatively dif- 
ferent continuum from ‘heaviness,’ 
‘length,’ ‘duration,’ ‘handwriting ex- 
pertness,’ or other psychological at- 
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tributes often used by psychologists to 
test theoretical models of scaling pro-
cedures. In several ways defectiveness
of speech samples (usually tape re-
corded) is more difficult to handle in
experimental situations than the above
mentioned attributes. Tape-recorded
speech samples must be listened to one
at a time. Repetitions for observers of
certain samples or pairs of samples can-
not be easily accomplished. Although
the results obtained from using two or
more procedures with the same samples
can be compared, no physical or out-
side measure is available with which to
compare the psychological attribute,
degree of defectiveness. Usefulness of
the various scaling techniques, there- 
fore, must be determined in terms of 
their reliability, the variability of judg- 
ments obtained, and possible linear re- 
lationships among the scales resulting 
from the procedures used. 

The possible uses of a precise and 
convenient scaling procedure are many. 
Such a procedure could be used to ob- 
tain criterion measures for evaluating 
different types of speech therapy, to 
determine the extent to which severity 
of a speech deviation relates to some 
other variable under study, or to con- 
struct ‘severity scales’ of defectiveness 
as done by Morrison (2). The scales 
could be used for the training of clini- 
cians and experimental personnel in the 
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reliable rating of some aspect of speech 
and for the training of the individual 
with a speech problem in judging his 
own variability from situation to situa- 
tion. 


In the area of speech pathology, rela- 
tively little investigation has been di- 
rected toward obtaining scale values 
in terms of ratios. A ratio scale, com- 
pared to widely used interval scales, 
has the advantage of an absolute zero, 
a feature which permits use of ratios 
of scale numbers in all numerical and
statistical operations, which in turn
makes possible more meaningful inter-
pretations of the results. In the Sherman
and Moodie study (6), results of ratio 
scaling by the method of constant sums 
were considerably different from the 
results derived by the other three 
scaling procedures: equal-appearing in- 
tervals, successive intervals, and pair 
comparisons. The distribution with 
clustering of scale values at the extremes 
of the scale and large gaps through the 
middle of the range resulting from the 
method of constant sums could not be 
meaningfully interpreted. There was 
thus no evidence that the method could 
be useful for scaling defectiveness of ar- 
ticulation. Further experimentation with 
an additional ratio scaling method might, 
however, contribute information in re- 
gard to obtaining a ratio scale useful 
for such evaluations. 


In the same study (6) the method of 
equal-appearing intervals seemed to be 
the most useful of the four methods 
compared for scaling defectiveness of 
articulation. Scale values obtained were 
reliable, relatively easy to compute, and 
in very close agreement with the inter- 
nally consistent scale values obtained by 
the method of successive intervals. The 
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method itself, however, has certain in- 
herent weaknesses which prohibit some 
investigators from accepting it as the 
most desirable method available for 
scaling defectiveness of articulation. By 
this method all observers are forced to 
use an absolute scale; individual or group 
observer biases cannot be corrected 
and removed. The presence of an end 
effect, that is, a piling of judgments in 
the end intervals, is another weakness. 
Even if all of the above limitations 
could be eliminated, the method would 
still yield an interval scale which lacks 
an absolute zero. 


The question arises as to whether a 
ratio scaling method can be found 
which is reliable, relatively simple in
its application, and useful for evaluating
defectiveness of articulation. Many 
scaling methods which result in ratio 
scales are impossible to use with tape- 
recorded speech samples, or they are so 
laborious for the observers and experi- 
menters that they are impracticable. 
Stevens (7) and other psychophysicists 
(8, 9) at the Harvard Acoustic Labora- 
tory have been experimenting recently 
with the method of direct magnitude- 
estimation in scaling the psychological 
attributes of physical stimuli such as 
loudness, pitch, brightness, and finger 
span. The method appears to be easily 
employed and the obtained results seem 
reliable. 


This method, which is used in the
present study, involves presenting stim- 
uli one at a time to a group of observers. 
The experimenter may assign a number 
to the first stimulus, which is to be used 
as the standard. (The stimulus in the 
present experiment is a speech sample.) 
For the succeeding samples the observer 
attempts to assign numbers proportional 
to the standard along the continuum of
measurement (in the present experi- 
ment, severity of articulation defective- 
ness). The geometric mean, median, or 
arithmetic mean value assigned by the 
observers becomes the scale value for 
a stimulus. The experimenter may, on 
the other hand, simply present the stim- 
uli to the observer one at a time and 
ask him to assign whatever numbers 
represent the relative position of each 
stimulus on the continuum. In this ap- 
plication of the method, the first stim- 
ulus becomes the observer’s standard. 
He is free to assign to it whatever num- 
ber he chooses. The only restriction 
placed on him is that he is required on 
any succeeding stimulus to use a num- 
ber which will indicate the position of 
that stimulus on the continuum relative 
to the standard with its designated 
number. (In the present study, for in- 
stance, if he were to assign the number 
12 to the first sample and he judged the 
second sample to be twice as severe in 
articulation defectiveness he would 
have to assign the number 24 to that 
sample.) The scale values are computed 
by bringing the individual observer 
estimates made for a given sample into
coincidence by multiplying by an ap- 
propriate factor. Constant corrections 
are then made for each observer and an 
index of central tendency of the cor- 
rected estimates becomes the scale value 
for each sample. This particular appli- 
cation of the method permits each ob- 
server freedom to use whatever size 
scale he chooses; ideally, the assigned 
values would be linear from observer 
to observer and for this reason observer 
biases can be removed. 
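
The free-number computation described above may be illustrated by a minimal sketch. It assumes, since the text does not state the multiplying factor, that each observer's estimates are rescaled so that his number for the standard becomes a common value (100 here), and that the arithmetic mean serves as the index of central tendency. The function name `scale_values` and the two observers' numbers are hypothetical.

```python
from statistics import mean


def scale_values(estimates, standard_index=0, common_value=100.0):
    """Bring each observer's estimates into coincidence and average them.

    estimates: one list per observer, one number per speech sample.
    standard_index: position of the standard sample (assumed, not from the study).
    common_value: value every observer's standard is rescaled to (assumed).
    """
    corrected = []
    for observer in estimates:
        own_standard = observer[standard_index]
        # Multiplying by common_value / own_standard removes the observer's
        # private choice of scale size while preserving his ratios.
        corrected.append([x * common_value / own_standard for x in observer])
    n_samples = len(corrected[0])
    return [mean(row[j] for row in corrected) for j in range(n_samples)]


# Hypothetical raw estimates: both observers judge the second sample twice
# as severe as the standard and the third half as severe.
raw = [
    [12, 24, 6],     # observer A called the standard 12
    [50, 100, 25],   # observer B called the standard 50
]
values = scale_values(raw)
print(values)
```

Because only ratios to the standard survive the correction, the two observers agree exactly in this example even though their raw numbers differ by a factor of about four.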

Systematic differences in obtained 
scale values may be the result of small
variations in the conditions under which
the method of direct magnitude-esti- 
mation is used. Sources of bias may 
possibly be introduced by the use of a 
standard. For example, Stevens (7), in 
experimenting with loudness judgments 
of pure tones, varied the number as- 
signed to the standard stimulus and the 
intensity level of the standard stimulus. 
He found least variability of assigned 
values among observers when the stand- 
ard fell near the middle of the inten- 
sity range. As the ratio of the standard 
to the variable stimuli increased, vari- 
ability of judgments increased. This 
variability was determined by an ex- 
amination of the interquartile ranges 
of judgments around the median scale 
values. He suggested the use of a num- 
ber such as 10 or 100 which is easily 
divided and multiplied. He stressed 
the importance of assigning a number 
only to the standard stimulus, thus leav- 
ing the observer completely free to 
decide what number he will assign to 
the variable. If the experimenter assigns
numbers to more than one stimulus, he
introduces constraints of the sort that
force the observer to make judgments 
on an interval rather than a ratio scale. 
The question arises as to whether scale 
values may differ systematically de- 
pending upon the number of times the 
observer can refer to the standard stim- 
ulus. 

Stevens (7) also has investigated re- 
sults in the case where no specific point 
assignment is designated for the stand- 
ard. He found very close agreement 
in loudness judgments between the two 
conditions with and without a desig- 
nated standard. He reports that his ob- 
servers tended to feel less certain about 
their estimates when no specific point 
assignment was set for them, but the
obtained scale values did not differ sig- 
nificantly between the two conditions. 

The major purpose of the present 
study was to evaluate the usefulness of 
the method of direct magnitude-esti- 
mation for scaling defectiveness of ar- 
ticulation. Specifically, the following 
questions were asked: 


(a) Can the method of direct magni- 
tude-estimation be used to scale de- 
fectiveness of articulation reliably? 
(b) Is the method practical in terms 
of experimenter and observer time? 
(c) Do obtained scale values differ 
depending upon presence or absence 
of a designated standard? 

(d) Do obtained scale values differ 
systematically when a standard of 
medium severity is used as opposed 
to a standard of mild severity? 

(e) Do obtained scale values differ 
systematically when the number assigned
to the standard stimulus is
varied? 

(f) Do obtained scale values differ 
systematically when the standard is 
presented before every sixth stimulus 
as opposed to when the standard is 
presented only at the beginning of 
the experimental stimuli? 


Procedure 


Test Items for Scaling. The experimental
material consisted of 27 five-second
segments from the continuous
speech of children between the ages of 
five and 10 years selected to represent
a range of articulation from normal to
severely defective. The same material
was used in the Sherman and Moodie
study (6). The material was originally
selected from the four severity scales
constructed by Morrison (2).


Equipment. Original speech sample 
recordings were made in conditions of 
quiet on a Magnecord tape recorder, 
Model PT-6V, with an Altec, Model 
21C, condenser microphone. Tape speed 
was 15 inches per second. The portions 
of the original tapes that contained the 
above mentioned samples were dubbed 
at the same speed onto new magnetic 
tape on a Presto recorder, Model RC- 
10/24. All listening sessions were held in 
a sound-treated room. An Ampex, 
Model 350C, followed by a power amplifier
was used for the playbacks.


Observers. Observers were 200 stu- 
dents from the University of Iowa cur- 
rently enrolled in an elementary psy- 
chology course. They were subdivided 
into five groups of approximately 40 
observers. Each of four groups partici- 
pated in one of the experimental con- 
ditions described below. The fifth 
group participated in two conditions to 
meet the requirements of the experi- 
mental design. Not more than 22 ob- 
servers took part in any one listening 
session. This provision was made for
the purpose of insuring control of the 
listening procedure. 


Experimental Conditions. Six experi- 
mental conditions were designed to 
compare scale values derived from data 
obtained under differing instructions. 
The conditions were planned to study 
the effects of (a) a standard stimulus of 
medium severity as compared to a 
standard stimulus of mild severity, (b) 
a standard stimulus of medium severity 
designated as 10 as compared to the 
same standard stimulus designated as 
100, (c) a standard stimulus presented 
before every sixth segment as compared 
to a standard stimulus presented only 
at the beginning of the stimuli, and 
(d) a standard stimulus with a designated
number of points as compared to
a standard stimulus with no specific 
point assignment. 


Specifically, the six conditions were 
as follows: Condition I, standard of 
medium severity designated as 100, 
standard presented only at beginning of 
stimuli; Condition II, standard of medi- 
um severity designated as 10, standard 
presented only at beginning of stimuli;
Condition III, standard of medium se- 
verity designated as 100, standard re- 
peated before every sixth speech 
segment; Condition IV, standard of me- 
dium severity (same standard stimulus 
as for Conditions I, II, and III) desig- 
nated by no specific point assignment, 
observers free to designate whatever 
number desired; Condition V, same as 
Condition I, with observers who had 
taken part in Condition IV exactly one 
week earlier; Condition VI, standard 
of mild severity designated as 10, stand- 
ard presented only at beginning of 
stimuli. 


Under each condition the observers 
heard and rated the 27 five-second seg- 
ments four times, all within one 50- 
minute listening session. The four trials 
were employed to provide (a) for 
comparison of the effects, if any, of 
several sequences and (b) for an evalu- 
ation of the effect of practice, if any. 
Three different random sequences of 
the 27 segments had been tape re- 
corded: Tape A, Tape B, and Tape C. 
The first three trials for each condition 
consisted of these three tapes; the 
fourth trial was a repeat of the tape 
employed in the first trial. For Condi- 
tions I, IV, V, and VI the sequences 
were A, B, C, and A, administered for 
the four trials in the order given. For
Condition II the sequences were B, C,
A, and B, and for Condition III, C, A, 
B, and C, in the orders given. 


Results and Discussion 


Obtaining Scale Values. In the meth- 
od of direct magnitude-estimation, in- 
vestigators have used differing computa- 
tional procedures to derive scale values. 
Medians, arithmetic means, and geo- 
metric means have each been used as 
the measure of central tendency. In 
the present investigation the 27 speech 
segments had been rated on four trials 
in each of six conditions for a total of 
648 obtainable scale values of defective- 
ness of articulation. Preliminary to 
selection of a measure of central tend- 
ency, 60 sets of judgments were ran- 
domly selected and 60 scale values were 
computed by each of the three pro- 
cedures. 


For several reasons the comparisons indicated
the use of arithmetic means as
the procedure of choice in this
investigation.¹ In order to use the median the
experimenter is required in some in- 
stances to make qualitative judgments 
regarding its location. When the median 
is included within a group of identical 
responses its selection is to some extent 
dependent upon the size of the inter- 
val decided upon by the experimenter, 
and, for practical reasons, the use of 
arbitrarily chosen intervals is necessary 
in computation of medians. The 60 
medians differed, but not in any system- 
atic fashion, both from the correspond- 


¹Stevens (7) used medians as the index of
central tendency. He suggested that there are
always a few observers whose estimates deviate
far from those of the majority, but the behavior
of the typical (median) individual is usually
quite consistent.
























ing arithmetic means and from the 
corresponding geometric means. The 
60 geometric means differed consis- 
tently from the 60 arithmetic means; 
they were, in general, approximately 
10 points lower than the corresponding 
arithmetic means. The assumption was 
thus made that the use of geometric 
means rather than arithmetic means 
would be of little, if any, advan- 
tage in making comparisons among con- 
ditions. For these reasons arithmetic 
means were selected as the most useful 
and practical measure of central tend- 
ency for accomplishing the purposes of 
this study. 
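
The three indexes can be compared on a single set of judgments, as in the sketch below. The judgments are hypothetical and are not data from the study; `statistics.geometric_mean` requires Python 3.8 or later. For positive numbers the geometric mean can never exceed the arithmetic mean, which is consistent with the direction of the difference reported above.

```python
from statistics import geometric_mean, mean, median


def central_tendencies(judgments):
    """Return the three candidate scale values for one set of judgments."""
    return {
        "median": median(judgments),
        "arithmetic_mean": mean(judgments),
        "geometric_mean": geometric_mean(judgments),
    }


# Hypothetical judgments of one speech segment by seven observers,
# including one observer who deviates far from the majority.
judgments = [80, 90, 100, 100, 110, 120, 300]
indexes = central_tendencies(judgments)
for name, value in indexes.items():
    print(f"{name}: {value:.2f}")
```

The single extreme judgment pulls the arithmetic mean upward more than it pulls the geometric mean, and leaves the median untouched.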

Practice and Sequence Effects. Three 
random sequences were employed 
under each of the six conditions, with 
the sequence of Trial 1 repeated in 
Trial 4. The use of several sequences 
and the repetition of one sequence were 
to provide for determining whether 
practice and sequence had any impor- 
tant effect upon obtained scale values. 
Intraclass correlation procedures (1),
utilizing the analysis of variance technique
of a treatments-by-subjects design
and allowing for a trend analysis
of the data, were employed to obtain
reliability coefficients measuring the
reliability both of individual scale values
for four trials, designated as $r_1$,² and
of averages over trials, designated as $r_2$.³
The between-trials variance was signifi- 
cant at or beyond the 5% level for five 
of the six conditions, indicating that 
general level of rating was not the same 
for all trials within a condition. Exam- 
ination of the means, however, revealed 
no consistent pattern of trend over trials 
from one condition to another, and 


²$r_1 = (ms_S - ms_{wS})/(ms_S + 3\,ms_{wS})$.
³$r_2 = (ms_S - ms_{wS})/ms_S$.


Table 1. Intraclass correlation coefficients for
evaluating reliability of individual scale values
for 27 speech segments for four trials, $r_1$, and
of averages over trials, $r_2$.

these significant trends cannot be attributed
either to practice or to sequence.
Furthermore, the trend differences are 
very small, the largest difference being 
10 points between Trials 2 and 4 of 
Condition IV on a scale of approxi- 
mately 140 points. The between-trials 
variance, even when significant, was not 
removed from the error term, and the 
six obtained $r_1$s are thus minimum estimates.
The results of the two analyses
for each of the six conditions are shown
in Table 1. The $r_1$s range from .94 to
.98 and the $r_2$s from .98 to .996. The
consistently high $r_1$s provide strong
evidence that neither sequence of pres- 
entation nor practice effects had any 
important influence upon the obtained 
scale values. In other words, the seg- 
ment-by-trial interaction is negligible. 
In view of this conclusion, averages of 
scale values over four trials for each of 
the 27 segments (see Table 2) were 
employed for making comparisons 
among the six conditions. That these 
mean scale values are highly reliable is 
indicated by the very high $r_2$s.
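
The two reliability coefficients can be sketched from a segments-by-trials table of scale values. The sketch assumes the usual intraclass expressions, with the within-segments mean square as the error term (that is, with the between-trials variance left in the error term, as described above). The function name and the toy values are hypothetical and are not data from the study.

```python
def intraclass_coefficients(scores):
    """Return (r1, r2) for a table with one row per segment, one column per trial.

    r1 estimates the reliability of a single trial's scale values;
    r2 estimates the reliability of the average over all k trials.
    """
    n = len(scores)       # number of segments
    k = len(scores[0])    # number of trials
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]

    # Mean square between segments.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Mean square within segments: trial differences stay in the error term.
    ss_within = sum((x - m) ** 2 for row, m in zip(scores, row_means) for x in row)
    ms_within = ss_within / (n * (k - 1))

    r1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    r2 = (ms_between - ms_within) / ms_between
    return r1, r2


# Hypothetical scale values for three segments on four trials.
toy = [
    [50, 52, 48, 50],
    [100, 98, 102, 100],
    [150, 155, 145, 150],
]
r1, r2 = intraclass_coefficients(toy)
print(round(r1, 4), round(r2, 4))
```

With four trials the denominator of the first coefficient contains three times the error mean square. The two coefficients are linked by the Spearman-Brown relation, so the coefficient for averages is never lower than the one for single trials whenever the latter is positive.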


Comparisons with Interval Scaling 
Results. The experimental samples had 
been scaled by the method of equal- 
appearing intervals in another study 
(6). To compare the ratio scale values 
with the interval scale values, linear 
transformations were made between the 

Table 2. Scale values of articulation defectiveness for the 27 experimental speech segments for
each of the six conditions obtained by averaging across the four trials of each condition.

Segment                          Conditions
Number         I       II      III       IV        V       VI
   1       48.70     5.65    66.45    60.15    50.82    10.00
   2       57.15     6.60    64.96    68.47    51.26    10.36
   3       67.87     8.12    87.30    71.90    62.24    12.70
   4       39.65     5.33    50.74    57.79    42.07     8.88
   5       68.68     6.91    68.23    67.95    64.33    11.12
   6       66.94     7.49    60.59    76.77    73.29    11.19
   7       58.20     6.57    73.14    72.01    60.41    10.47
   8       78.30     9.33    74.10    83.78    73.39    12.61
   9       68.48     7.22    76.48    75.21    53.75    12.68
  10       96.58    10.91    96.48   105.82    88.60    16.44
  11       84.90     9.72    91.16    87.38    71.07    15.19
  12       67.19     7.62    70.18    73.58    66.43    12.19
  13      100.00    10.00   100.00   100.00   100.00    17.17
  14       84.35    10.12    95.09    87.87    65.75    13.72
  15      117.98    13.15   135.74   111.37   103.04    21.37
  16       87.86    10.17    86.97   101.32    91.20    13.73
  17      140.72    16.81   129.38   123.00   119.91    20.72
  18      137.87    16.36   122.27   142.08   140.26    22.85
  19      161.79    17.88   146.20   144.69   139.91    25.34
  20      187.51    22.55   162.58   166.45   164.79    28.21
  21      130.08    15.23   127.96   140.87   122.39    24.35
  22      231.31    24.25   181.31   181.13   198.25    30.45
  23      147.22    18.10   141.95   138.07   132.36    24.00
  24      180.48    20.15   134.69   162.08   157.71    25.79
  25      209.93    24.76   189.96   180.03   189.39    31.53
  26      248.38    26.60   192.11   187.17   196.85    31.65
  27      226.26    24.42   165.49   179.65   199.39    29.95








scale values obtained by the method of
equal-appearing intervals and those obtained
by the method of direct magnitude-estimation.
The equal-appearing
intervals values, thus, were transformed
so that both the mean and standard
deviation equaled the corresponding indexes
of the direct magnitude-estimation
values. The transformations were
made separately for each of the six sets

Figure 1. Transformed scale values obtained
by the method of equal-appearing intervals 
(EAI) in the previous study (6) plotted 
against the corresponding scale values ob- 
tained by the six conditions of the method 
of direct magnitude-estimation in the present 
study. 


of direct magnitude-estimation scale 
values. The values were then plotted, as 
shown in Figure 1, with equal-appear- 
ing intervals scale values represented 
along the ordinate and the direct mag- 
nitude-estimation scale values along the 
abscissa. Also given in Figure 1 are Pearson
$r$s, regression equations computed
by the method of least squares, and 
standard errors of estimate for each of 
the six comparisons. 
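
The transformation and the accompanying statistics can be sketched as follows. The sketch rests on assumptions: the helper names and the five pairs of values are hypothetical, and the standard error of estimate is computed as the standard deviation of the ordinate values times the square root of one minus the squared correlation, since the variant used in the study is not stated here.

```python
from math import sqrt
from statistics import mean, pstdev


def match_mean_and_sd(values, reference):
    """Linearly transform values so their mean and SD equal those of reference."""
    m_v, s_v = mean(values), pstdev(values)
    m_r, s_r = mean(reference), pstdev(reference)
    return [m_r + (v - m_v) * s_r / s_v for v in values]


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    m_x, m_y = mean(x), mean(y)
    cross = sum((a - m_x) * (b - m_y) for a, b in zip(x, y))
    return cross / sqrt(sum((a - m_x) ** 2 for a in x) * sum((b - m_y) ** 2 for b in y))


def regression_of_y_on_x(x, y):
    """Least-squares slope and intercept, plus the standard error of estimate."""
    r = pearson_r(x, y)
    slope = r * pstdev(y) / pstdev(x)
    intercept = mean(y) - slope * mean(x)
    see = pstdev(y) * sqrt(max(0.0, 1 - r ** 2))
    return slope, intercept, see


# Hypothetical values: interval-scale values (ordinate) and ratio-scale
# values (abscissa) for five segments.
eai = [1.5, 3.0, 4.2, 6.0, 8.1]
dme = [40.0, 70.0, 100.0, 150.0, 230.0]

eai_transformed = match_mean_and_sd(eai, dme)
r = pearson_r(dme, eai_transformed)
slope, intercept, see = regression_of_y_on_x(dme, eai_transformed)
print(round(r, 3), round(slope, 3), round(intercept, 2), round(see, 2))
```

A linear transformation leaves the correlation unchanged, and once the two standard deviations have been equated the least-squares slope is equal to the correlation coefficient itself.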


The relationships do not depart from 
linearity as might have been expected 
from results of the earlier study (6) 
which indicated a curvilinear relation- 
ship between interval scale values and 
ratio scale values. The correlation coefficients
estimating strength of relationship
between sets of scale values ranged
from .94 to .97. The standard errors of 
estimate shown are, of course, relative 
to the size of the scale used and cannot 
be compared on an absolute basis. These 
results, in total, show that the scale 
values obtained by the method of direct 
magnitude-estimation are in very close 
agreement with the scale values ob- 
tained by the method of equal-appear- 
ing intervals. 


Of the six sets of scale values, the set 
in closest agreement with the equal- 
appearing intervals scale values was ob- 
tained under Condition IV, the only 
condition in this investigation in which 
no specific point assignment was made 
for the standard segment. The obtained 
correlation coefficient is .97 and the 
standard error of estimate, 10.26 on a 
140-point scale, is the smallest (relative 
to the scale) of the six. Correlation be- 
tween the scale values of Condition III, 
the only condition in which the stand- 
ard was repeated occasionally through- 
out the presentation of the stimuli, and 
the equal-appearing intervals scale val- 
ues is .94; this is the only correlation co- 
efficient that falls below .96 and it has 
the largest standard error, 14.74 on a 
140-point scale, of the six. Relationship 
between the scale values obtained by 
the two methods thus cannot be said 
to differ importantly from one condi- 
tion to another. 


The high correlation coefficients ob- 
tained and the linearity of the relation- 
ships in the above comparisons show 
basically very close agreement between 
the scale values obtained by the meth- 
ods of direct magnitude-estimation and 
equal-appearing intervals. If it is as- 
sumed that the method of direct mag- 
nitude-estimation yields a true ratio 
scale, the closeness and the linearity of
the relationships demonstrated here pro- 
vide some evidence that, in the scaling 
of defectiveness of articulation, the 
limitations of the method of equal-ap- 
pearing intervals may not be important. 
These limitations, or so-called inherent 
weaknesses, as cited earlier, are the 
presence of an end effect, the failure 
to remove observer biases, and the re- 
sultant interval rather than ratio scale. 
The method of direct magnitude-estimation
appears to be a very useful technique,
especially when a ratio scale is
desired or when the experimenter has 
reason to refrain from setting an abso- 
lute scale in which all observers must 
place their judgments. 


Comparisons of the methods of direct 
magnitude-estimation and successive in- 
tervals were not made. In the previous 
study (6) the scale values obtained by 
the method of successive intervals were 
in extremely close agreement with those 
obtained by the method of equal-ap- 
pearing intervals. It was assumed in the 
present study that the comparisons of 
the present scale values with those ob- 
tained by the method of successive in- 
tervals would essentially duplicate the 
above interval scale comparisons. 


Comparison with other Ratio Scaling 
Results. In the previous study (6), 
scale values obtained by the method of 
constant sums, the only method of ratio 
scaling included, were difficult to inter- 
pret. The values were curvilinearly re- 
lated to the values obtained by the 
methods of equal-appearing intervals, 
of successive intervals, and of pair 
comparisons. The constant-sums values 
clustered at the extremes of the scale 
with large gaps through the middle of 
the range. Because the direct magni- 

Figure 2. Scale values obtained by the six
conditions of the method of direct magni- 
tude-estimation in the present study plotted 
against the corresponding scale values ob- 
tained by the method of constant sums (CS) 
in the previous study (6). 


tude-estimation scale values are linearly 
related to and in very close agreement 
with the equal-appearing intervals 
values, it might be expected that the 
relationship between the direct magni- 
tude-estimation values and the constant- 
sums values might depart from linearity. 
The scale values of each of the six con- 
ditions were plotted against the scale 
values previously obtained by the meth- 
od of constant sums and are shown in 
Figure 2. Contrary to expectation the 
relationships do not appear to be clearly 
curvilinear in nature, and the correla- 
tion coefficients (see Figure 2) range 
from .94 to .96, indicating close rank 
order agreement. The coefficients are 
only slightly lower than those for com- 
parison with equal-appearing intervals 
results. The deviations of the scale
values obtained by the method of con- 
stant sums from the other three sets of 
values of the previous study, however, 
perhaps make their usefulness question- 
able. 

Comparisons with Pair Comparisons 
Scale Values. Correlation coefficients 
for estimating the relationships between 
the six sets of scale values obtained by 
the method of direct magnitude-estima- 
tion and the scale values obtained by 
the method of pair comparisons in the 
previous study (6) ranged from .91 for 
Condition III to .97 for Condition IV. 
The other four correlation coefficients 
were each .96. Close agreement is thus 
indicated for each of the six conditions. 
In the previous study the obtained cor- 
relation coefficient of .96 indicated a 
close relationship between the scale 

Figure 3. Trend of severity for each of six
conditions with experimental speech seg- 
ments arranged in order from least to most 
severe by the previous (6) equal-appearing 
intervals results. Scale values are plotted 
for every third segment. 

values obtained by the methods of pair
comparisons and equal-appearing inter- 
vals. Despite this high correlation, the 
scale values obtained by the method of 
pair comparisons were considered to be 
unsatisfactory because of their demon- 
strated lack of internal consistency. 


Comparisons among the Six Sets of 
Scale Values Obtained by the Method 
of Direct Magnitude-Estimation. It was 
expected that scale values obtained 
under the six conditions of this study 
would vary from one condition to an- 
other. Whether these differences would 
be systematic or attributable directly 
to the specific instructions involved for 
each condition was, of course, not 
known. Initial comparisons were made 
by simply plotting the six obtained 
scale values for each of the 27 segments, 
arranging these segments from least to 
most severe according to the scale 
values obtained by the previous equal- 
appearing intervals results. Because of 
its complexity the entire plot is not 
reproduced here. Every third segment 
(ordered from least to most severe by 
the equal-appearing intervals results) 
is represented in Figure 3. From an ex- 
amination of this figure it is apparent 
that the values for approximately two- 
thirds of the segments are in very close 
agreement but that the scale values for 
Conditions I and II deviate from the 
scale values of the other conditions at 
the more severe end of the scale. Here 
the difference can be seen to be mainly 
one of absolute value, not of rank 
order. In Conditions I and II, especially 
Condition II, the observers simply ex- 
tended the scale and rated the more 
severe segments relatively higher than 
the observers who rated under the 
other four conditions. For Conditions I 

Figure 4. Scale values of articulation defectiveness for 27 speech segments obtained under six
conditions plotted to show all possible comparisons among the six conditions. 


and II identical instructions were used 
except that in Condition I the standard 
was assigned 100 points and in Condi- 
tion II it was assigned 10 points. The 
standard was a segment of medium 
severity presented only at the beginning 
of the experimental speech segments. 
Why observers overestimated at the 
upper extreme in these two conditions, 
or underestimated in the other four 
conditions, is not understood. Because 
the other four sets of scale values ap- 
pear to be in close agreement through- 
out the range, the experimenter chooses 
to assume that the observers in Condi- 
tions I and II overevaluated the more 
severe segments. 

The next comparisons among the six 
sets of scale values also were done 
graphically. The scale values of each 
condition were paired with those of 
every other condition to study linearity 
and agreement. The 15 plots are shown 
in Figure 4. Certainly linearity prevails 
in all 15 comparisons, but agreement appears
to vary somewhat as judged by
the scatter seen in the plots. Pearson
correlation coefficients, reported along 
with each plot, range from .96 to .99, 
not low enough nor different enough 
from each other for statistical tests to 
discriminate among the 15 possible pairings
of the six conditions. The information
presented in Figure 4 indicates that,
with the exception of the extended
scales for Conditions I and II, systematic 
differences among the six sets of scale 
values were not obtained. 
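
The enumeration of the 15 pairings can be sketched as below. Only five of the 27 segments from Table 2 (segments 1, 4, 13, 20, and 26) are entered, to keep the example short, so the coefficients it prints are illustrative and will not reproduce those reported in Figure 4. The variable and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Scale values for segments 1, 4, 13, 20, and 26, copied from Table 2.
table2_subset = {
    "I":   [48.70, 39.65, 100.00, 187.51, 248.38],
    "II":  [5.65, 5.33, 10.00, 22.55, 26.60],
    "III": [66.45, 50.74, 100.00, 162.58, 192.11],
    "IV":  [60.15, 57.79, 100.00, 166.45, 187.17],
    "V":   [50.82, 42.07, 100.00, 164.79, 196.85],
    "VI":  [10.00, 8.88, 17.17, 28.21, 31.65],
}


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    m_x, m_y = mean(x), mean(y)
    cross = sum((a - m_x) * (b - m_y) for a, b in zip(x, y))
    return cross / sqrt(sum((a - m_x) ** 2 for a in x) * sum((b - m_y) ** 2 for b in y))


# Every condition paired with every other condition: 6 * 5 / 2 = 15 pairs.
pairings = list(combinations(table2_subset, 2))
correlations = {
    (a, b): pearson_r(table2_subset[a], table2_subset[b]) for a, b in pairings
}

for (a, b), r in correlations.items():
    print(f"Condition {a} vs. Condition {b}: r = {r:.3f}")
```

Because the correlation coefficient is unaffected by the size of the scale, Conditions II and VI (standard designated as 10) can be paired directly with the conditions in which the standard was 100.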

For Conditions I and V identical in- 
structions were given. The observers 
used in Condition V, however, had pre- 
viously judged the segments under Con- 
dition IV. The obtained correlation 
coefficient of .99 is indicative of very 
close rank order agreement. As dis- 
cussed above, however, the absolute 
values assigned vary considerably at 
the upper end of the scale. The range 
of scale values obtained by Condition 
I is 39.65 to 248.38 and by Condition 
V is 42.07 to 199.39. The intraclass 
correlations presented earlier in this
chapter provide evidence that the four 
trial sets of scale values of Condition V 
are in somewhat closer agreement than 
those of any other condition. Perhaps 
the additional practice that the ob- 
servers had for Condition V accounts 
for the slightly increased reliability. 
If so, this could be interpreted as lend- 
ing support to the earlier assumption 
that the scale values obtained by Con- 
dition I (and by Condition II) are 
overestimated at the upper end of the 
scale. 


High agreement (r = .99) was found 
between the scale values obtained by 
Conditions IV and V. In Condition IV 
the observers were free to assign what- 
ever number desired to the standard 
segment and thus to use whatever size 
scale they wanted. In Condition V these 
same observers returned one week later 
and judged the segments with the same 
standard stimulus designated as 100 
points. In their first and free choice 
task, 35 of the 40 observers used num- 
bers between 10 and 25 for the standard 
segment. When a much higher number 
and larger scale were assigned, the 
group judgments were consistent with 
the previous judgments. The obtained 
correlation coefficients (.98, .99, .97, .99, 
and .99), estimating strength of relation- 
ship between the Condition IV scale 
values and the other five sets of scale 
values, were high and the lack of de- 
parture from linearity for each com- 
parison is obvious. The application of 
the method in which observers are free 
to use whatever size scale desired is thus 
satisfactory, and perhaps the most satis- 
factory of those presently investigated. 
It is the only one which provides for 
removal of observer biases and in this 
case responses to the stimuli are apparently
satisfactorily linear from observer
to observer to accomplish this purpose. 


The scale values of Condition III, in 
which the standard segment was re- 
peated before every sixth segment, do 
not appear to differ in any substantial 
way from the scale values of Conditions 
IV, V, and VI. Since the repetition of
the standard segment in Condition III 
does not appear to add reliability, and 
the results are in close agreement with 
the results obtained by other methods 
of presenting the standard stimulus, its 
further use is to be questioned. The 
additional time required to repeat the 
standard probably is not justified. 

In Condition VI the segment used 
for the standard was one of mild 
severity rather than the one of medium 
severity used in all of the other condi- 
tions. Stevens (7) has indicated that 
variability of observer judgments, as 
measured by the interquartile range, 
increases as the difference between the 
standard segment and the variable seg- 
ments increases. Standard deviations 
for individual scale values were not 
computed in this study. Reliability of 
the scale values of the four trials of 
Condition VI, however, if evaluated by 
the obtained intraclass correlations, is 
as satisfactory for Condition VI as for 
the scale values of the other five con- 
ditions. Correlation coefficients estimat- 
ing strength of relationship between 
resultant scale values of Condition VI 
and those of all other conditions in- 
dicate good agreement. 


In summary, scale values do not de- 
pend upon whether a specific point 
assignment is given to the standard 
stimulus by the experimenter or the 
point assignment is left to the free 
choice of the observer. Results do vary,
however, depending upon changes in 
the number of specific points assigned 
the standard stimulus by the experi- 
menter. The scale is relatively extended 
at the upper end with the assignment 
of 10 points to the standard stimulus as 
compared to the assignment of 100 
points to the standard stimulus. Scale 
values do not depend importantly upon 
variation of the severity of the standard 
stimulus. Apparently there is no im- 
portant advantage in frequent presenta- 
tion of the standard stimulus over a 
single presentation at the beginning of 
a judging session. On the basis of the
results obtained in this investigation it 
appears that the method of direct mag- 
nitude-estimation is useful for scaling 
defectiveness of articulation. Scale 
values were found to be reliable and 
the method is practicable in terms of 
experimenter and observer time. 


Summary 


The purpose was to study the psy- 
chological scaling method of direct 
magnitude-estimation for obtaining 
measures of defectiveness of articula- 
tion along a ratio scale. Test items were 
27 tape-recorded five-second segments 
from children’s speech ranging from 
normal to severely defective in articu- 
lation. 

Scale values were derived from lis- 
tener responses obtained under six con- 
ditions which differed with respect to 
the standard stimulus and the point as- 
signment to the standard. Obtained sets 
of scale values were compared with one
another and with corresponding sets
obtained previously by the methods of 
equal-appearing intervals, of pair com- 
parisons, and of constant sums. Results 
indicate close correspondence between 
sets of scale values for all comparisons. 
Scale values were reliable, and the 
method was practicable in terms of 
experimenter and observer time. 